[ https://issues.apache.org/jira/browse/PHOENIX-1779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Samarth Jain resolved PHOENIX-1779.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: 4.4.0
                   5.0.0

Pushed to 4.4 and master branches. Thanks for the review [~jamestaylor].

> Parallelize fetching of next batch of records for scans corresponding to queries with no order by
> --------------------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-1779
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1779
>             Project: Phoenix
>          Issue Type: Improvement
>            Reporter: Samarth Jain
>            Assignee: Samarth Jain
>             Fix For: 5.0.0, 4.4.0
>
>         Attachments: PHOENIX-1779.patch, PHOENIX-1779_v2.patch,
>                      PHOENIX-1779_v3.patch, wip.patch, wip3.patch,
>                      wipwithsplits.patch
>
>
> Today in Phoenix we parallelize only the first execution of scans, i.e., we
> load only the first batch of records, up to the scan's cache size, in
> parallel. Loading of subsequent batches of records in scanners is
> essentially serial. This could be improved, especially for queries that do
> not need any kind of merge sort on the client, including the ones with no
> order by clauses.
> This could also potentially improve the performance of UPSERT SELECT
> statements that load data from one table and insert into another. One such
> use case is creating immutable indexes for tables that already have data.
> It could also potentially improve the performance of our MapReduce solution
> for bulk loading data by improving the speed of the loading/mapping phase.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)