Re: Sorting in runners

2018-08-07 Thread Rui Wang
Thanks! I will try to figure out why GSC is disallowed and share it if I find anything. -Rui On Tue, Aug 7, 2018 at 12:00 PM Lukasz Cwik wrote: > IMO, going with SortValues is the right way to go. The idea is that > runners can always replace the SortValues PTransform with their own > optimized

Re: Sorting in runners

2018-08-07 Thread Lukasz Cwik
IMO, going with SortValues is the right way to go. The idea is that runners can always replace the SortValues PTransform with their own optimized variant. As you have already pointed out, the default in-memory implementation has strict limitations. I would suggest going with the in-memory version
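
To make the replacement idea above concrete, here is a minimal sketch in plain Python. It is not Beam's API: `sort_values`, `in_memory_sorter`, `chunked_merge_sorter`, and `SortLimitExceeded` are hypothetical names invented for illustration, and the sketch assumes each key's values are (secondary_key, value) pairs sorted by the secondary key. The default sorter refuses inputs above a fixed size, standing in for the in-memory limitation, while a "runner" can pass in its own sorter without changing the caller.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical types and names; none of this is Beam's actual API.
Pair = Tuple[int, str]  # (secondary_key, value)
Sorter = Callable[[List[Pair]], List[Pair]]


class SortLimitExceeded(Exception):
    """Raised when the in-memory sorter is given more items than it allows."""


def in_memory_sorter(max_items: int) -> Sorter:
    """Default strategy: sort everything in memory, up to max_items."""
    def sort(values: List[Pair]) -> List[Pair]:
        if len(values) > max_items:
            raise SortLimitExceeded(
                f"{len(values)} items exceeds the limit of {max_items}")
        return sorted(values, key=lambda pair: pair[0])
    return sort


def chunked_merge_sorter(chunk_size: int) -> Sorter:
    """Stand-in for a runner-optimized variant: sort fixed-size chunks,
    then merge them. A real runner might spill the chunks to disk."""
    def sort(values: List[Pair]) -> List[Pair]:
        chunks = [
            sorted(values[i:i + chunk_size], key=lambda pair: pair[0])
            for i in range(0, len(values), chunk_size)
        ]
        return list(heapq.merge(*chunks, key=lambda pair: pair[0]))
    return sort


def sort_values(grouped: Dict[str, Iterable[Pair]],
                sorter: Sorter) -> Dict[str, List[Pair]]:
    """Sort each key's values by secondary key using the given strategy."""
    return {key: sorter(list(values)) for key, values in grouped.items()}
```

The caller only depends on `sort_values`, so swapping `in_memory_sorter(...)` for `chunked_merge_sorter(...)` changes the strategy without touching the pipeline code, which is the property the message above relies on.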

Sorting in runners

2018-08-07 Thread Rui Wang
Hi Community, I am trying to support ORDER BY in BeamSQL (currently in the global window only, see BEAM-5064). In order to do so, I need to sort a PCollection. The scale of the dataset that ORDER BY works on is unknown. It might be up to a TB-sized dataset if BeamSQL runs on some benchmarks. But in the most