Re: get method guid prefix for file parts for write

2020-09-25 Thread gpongracz
What Nick said is correct. I should also mention that I am using PySpark in this case, not Scala. I want to use the GUID prefix of part-0 to prevent a race condition by using an S3 waiter for the part to appear, but to achieve this I need to know the GUID value in
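A minimal sketch of one possible workaround when the GUID is not known ahead of time: rather than waiting on an exact key, poll the output prefix and match on the `part-00000-` stem. This is an illustration only, not something proposed in the thread. The function name `wait_for_part`, its parameters, and the bucket and prefix values are hypothetical; the client is assumed to be a boto3-style S3 client exposing `list_objects_v2`.

```python
import time


def wait_for_part(s3_client, bucket, prefix, stem="part-00000-",
                  delay=5.0, max_attempts=20):
    """Poll `prefix` until an object whose file name starts with `stem` appears.

    Hypothetical helper: returns the full key of the first match, or raises
    TimeoutError after `max_attempts` polls. Only the first page of the
    listing is inspected, so very large prefixes would need pagination.
    """
    for attempt in range(max_attempts):
        resp = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
        # "Contents" is absent from the response when nothing matches.
        for obj in resp.get("Contents", []):
            file_name = obj["Key"].rsplit("/", 1)[-1]
            if file_name.startswith(stem):
                return obj["Key"]
        if attempt < max_attempts - 1:
            time.sleep(delay)
    raise TimeoutError(f"no object starting with {stem!r} under {prefix!r}")
```

With boto3 installed, a call might look like `wait_for_part(boto3.client("s3"), "my-bucket", "output/run-1/")`, where both names are placeholders. Matching on the stem avoids needing the GUID, at the cost of repeated list calls.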

Re: get method guid prefix for file parts for write

2020-09-25 Thread gpongracz
I should add that I tried using a waiter on the _SUCCESS file, but it was not successful: because it is much smaller than the part-0 file, it seems to appear in S3 before the part-0 file, even though it was written afterwards.

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2020-09-25 Thread Jungtaek Lim
Thanks Etienne! Yeah, I forgot to say it was nice talking with you again. And sorry, I forgot to send the reply (it was in draft). Regarding investment in SS, well, unfortunately I don't know; I'm just an individual. There might be various reasons for that, most probably "priority" among other work. There's

Re: get method guid prefix for file parts for write

2020-09-25 Thread Nicholas Chammas
I think what George is looking for is a way to determine ahead of time the partition IDs that Spark will use when writing output. George, I believe this is an example of what you're looking for:

Re: get method guid prefix for file parts for write

2020-09-25 Thread EveLiao
If I understand your problem correctly, the prefix you provided is actually "-" + UUID. You can generate one with a UUID generator such as https://docs.python.org/3/library/uuid.html#uuid.uuid4. -- Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
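To make the "-" + UUID shape concrete, here is a small sketch using only the Python standard library. It assumes a part-file name of the form `part-<index>-<uuid>-c000.<ext>`; that layout is an assumption for illustration and is not spelled out in the thread. The helper `split_part_name` and the example file name are hypothetical, and the sketch does not show any way to make Spark use a UUID chosen in advance.

```python
import re
import uuid

# Assumed layout: "part-" + zero-padded index + "-" + canonical UUID + rest.
PART_RE = re.compile(
    r"^part-(\d+)-"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def split_part_name(file_name):
    """Return (partition_index, uuid.UUID) for a matching name, else None."""
    match = PART_RE.match(file_name)
    if match is None:
        return None
    return int(match.group(1)), uuid.UUID(match.group(2))


# Build a hypothetical file name around a freshly generated UUID.
example_uuid = uuid.uuid4()
example_name = f"part-00000-{example_uuid}-c000.snappy.parquet"
```

Calling `split_part_name(example_name)` gives back the partition index and the same UUID object, while a name such as `_SUCCESS` yields `None`.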