Yes, my thought exactly. Kindly let me know if you need any help porting it to PySpark.
On Mon, Nov 6, 2017 at 8:54 AM, Nicolas Paris <nipari...@gmail.com> wrote:

> On 05 Nov 2017 at 22:46, ayan guha wrote:
> > Thank you for the clarification. That was my understanding too.
> > However, how to provide the upper bound, as it changes for every call
> > in real life? For example, it is not required for Sqoop.
>
> True. AFAIK Sqoop begins by doing a
> "select min(column_split), max(column_split) from () as query;"
> and then splits the result.
>
> I was thinking of doing the same with a wrapper around the Spark JDBC
> reader that would infer the number of partitions and the upper/lower
> bounds itself.

--
Best Regards,
Ayan Guha
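For what it's worth, a minimal sketch of that wrapper idea in Python: first query min/max of the split column (as Sqoop does), then build one WHERE predicate per partition and pass them to `spark.read.jdbc(..., predicates=...)`. The column name, table name, and partition count below are illustrative assumptions, not from the thread; the predicate builder itself is plain Python so it can be shown without a live database.

```python
def split_predicates(column, min_val, max_val, num_partitions):
    """Build WHERE-clause predicates covering [min_val, max_val] in
    contiguous integer ranges, roughly mirroring what Spark's JDBC
    partitioning does once given lowerBound/upperBound.
    """
    if num_partitions < 1:
        raise ValueError("num_partitions must be >= 1")
    span = max_val - min_val + 1
    stride, remainder = divmod(span, num_partitions)
    preds, lower = [], min_val
    for i in range(num_partitions):
        upper = lower + stride + (1 if i < remainder else 0)
        if i == num_partitions - 1:
            # Open-ended last range so the max value is always included.
            preds.append(f"{column} >= {lower}")
        else:
            preds.append(f"{column} >= {lower} AND {column} < {upper}")
        lower = upper
    return preds


# Sketch of how this would plug into PySpark (hypothetical url/table/props):
#
# bounds = spark.read.jdbc(
#     url, "(SELECT MIN(id) AS lo, MAX(id) AS hi FROM my_table) b",
#     properties=props).first()
# df = spark.read.jdbc(
#     url, "my_table",
#     predicates=split_predicates("id", bounds.lo, bounds.hi, 8),
#     properties=props)

print(split_predicates("id", 0, 99, 4))
```

Each predicate becomes one Spark partition, so the caller never has to supply `lowerBound`/`upperBound` by hand.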