No, we are running on four 4-core machines.

On Wed, Dec 30, 2015 at 2:10 PM, Abdel Hakim Deneche <[email protected]> wrote:
> are you running the tests on 32 core machines? a different number of cores
> affects how much memory is available for the sort
>
> On Wed, Dec 30, 2015 at 1:02 PM, Abdel Hakim Deneche <[email protected]> wrote:
>
> > The following tests are failing:
> >
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q163_DRILL-2046.q
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q177_DRILL-2046.q
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q174.q
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/
> >> /Functional/window_functions/multiple_partitions/q35.sql
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q160_DRILL-1985.q
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q162_DRILL-1985.q
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q165.q
> >> /Functional/window_functions/multiple_partitions/q37.sql
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q171.q
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q168_DRILL-2046.q
> >> /Functional/window_functions/multiple_partitions/q36.sql
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q159_DRILL-2046.q
> >> /Functional/window_functions/multiple_partitions/q30.sql
> >> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/large/q157_DRILL-1985.q
> >> /Functional/window_functions/multiple_partitions/q22.sql
> >
> > With one of the following errors:
> >
> >> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory
> >> while executing the query.
> >> Caused by: org.apache.drill.exec.exception.OutOfMemoryException:
> >> org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate
> >> sv2, and not enough batchGroups to spill
> >>   at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:356) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >
> > or
> >
> >> java.sql.SQLException: SYSTEM ERROR: DrillRuntimeException: Failed to
> >> pre-allocate memory for SV. Existing recordCount*4 = 0, incoming batch
> >> recordCount*4 = 3340
> >> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException:
> >> Failed to pre-allocate memory for SV. Existing recordCount*4 = 0, incoming
> >> batch recordCount*4 = 3340
> >>   at org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.add(SortRecordBatchBuilder.java:116) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >>   at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:451) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
> >
> > On Wed, Dec 30, 2015 at 12:42 PM, Jacques Nadeau <[email protected]> wrote:
> >
> >> I'll let Steven answer your question directly.
> >>
> >> FYI, we are running a regression suite that was forked from the MapR repo a
> >> month or so ago because we had to fix a bunch of things to make it work
> >> with Apache Hadoop. (There was a thread about this back then and we haven't
> >> yet figured out how to merge both suites.) It is possible that he had a
> >> successful run but the failures are happening on items that you've recently
> >> added to your suite.
> >>
> >> It is also possible (likely?) that the configuration settings for our
> >> regression clusters are not the same.
> >>
> >> --
> >> Jacques Nadeau
> >> CTO and Co-Founder, Dremio
> >>
> >> On Wed, Dec 30, 2015 at 12:37 PM, Abdel Hakim Deneche <[email protected]> wrote:
> >>
> >> > Steven,
> >> >
> >> > were you able to successfully run the regression tests on the transfer
> >> > patch? I just tried and saw several queries running out of memory!
> >> >
> >> > On Wed, Dec 30, 2015 at 11:46 AM, Abdel Hakim Deneche <[email protected]> wrote:
> >> >
> >> > > Created DRILL-4236 <https://issues.apache.org/jira/browse/DRILL-4236> to
> >> > > keep track of this improvement.
> >> > >
> >> > > On Wed, Dec 30, 2015 at 11:01 AM, Jacques Nadeau <[email protected]> wrote:
> >> > >
> >> > >> Since the accounting changed (more accurate), the termination condition
> >> > >> for the sort operator will be different than before. In fact, this likely
> >> > >> will be sooner since our accounting is much larger than previously (since
> >> > >> we correctly consider the entire allocation rather than simply the used
> >> > >> allocation).
> >> > >>
> >> > >> Hakim,
> >> > >> Steven and I were discussing the need to update the ExternalSort operator
> >> > >> to use the new allocator functionality to better manage its memory
> >> > >> envelope. Would you be interested in working on this since you seem to be
> >> > >> working with that code the most? Basically, it used to be that there was
> >> > >> no way the sort operator would be able to correctly detect a memory
> >> > >> condition and so it jumped through a bunch of hoops to try to figure out
> >> > >> the termination condition. With the transfer accounting in place, this
> >> > >> code can be greatly simplified to just use the current operator memory
> >> > >> allocation.
> >> > >>
> >> > >> --
> >> > >> Jacques Nadeau
> >> > >> CTO and Co-Founder, Dremio
> >> > >>
> >> > >> On Wed, Dec 30, 2015 at 10:48 AM, rahul challapalli <[email protected]> wrote:
> >> > >>
> >> > >> > I installed the latest master and ran this query. So
> >> > >> > planner.memory.max_query_memory_per_node should have been the default
> >> > >> > value. I switched back to 1.4.0 branch and this query completed
> >> > >> > successfully.
> >> > >> >
> >> > >> > On Wed, Dec 30, 2015 at 10:37 AM, Abdel Hakim Deneche <[email protected]> wrote:
> >> > >> >
> >> > >> > > Rahul,
> >> > >> > >
> >> > >> > > How much memory was assigned to the sort operator
> >> > >> > > (planner.memory.max_query_memory_per_node)?
> >> > >> > >
> >> > >> > > On Wed, Dec 30, 2015 at 9:54 AM, rahul challapalli <[email protected]> wrote:
> >> > >> > >
> >> > >> > > > I am seeing an OOM error while executing a simple CTAS query. I
> >> > >> > > > raised DRILL-4324 for this. The query mentioned in the JIRA used to
> >> > >> > > > complete successfully without any issue prior to 1.5. Any idea what
> >> > >> > > > could have caused the regression?
> >> > >> > > >
> >> > >> > > > - Rahul
> >> > >> > >
> >> > >> > > --
> >> > >> > >
> >> > >> > > Abdelhakim Deneche
> >> > >> > >
> >> > >> > > Software Engineer
> >> > >> > >
> >> > >> > > <http://www.mapr.com/>
> >> > >> > >
> >> > >> > > Now Available - Free Hadoop On-Demand Training
> >> > >> > > <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
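
The accounting point in the quoted thread (charging the entire allocation rather than only the used portion, so the sort reaches its limit sooner) can be illustrated with a small sketch. This is not Drill code: `Buffer`, `should_spill`, the buffer sizes, and the budget are all hypothetical and chosen only to show how the same set of buffers can fit under one accounting scheme and trip the limit under the other.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    """Hypothetical buffer: `capacity` bytes reserved, `used` bytes holding data."""
    capacity: int
    used: int


def should_spill(buffers, limit, incoming, count_full_allocation):
    """Return True when accepting `incoming` more bytes would exceed `limit`.

    With count_full_allocation=True each buffer is charged its full capacity;
    with False only the bytes actually in use are charged.
    """
    charged = sum(b.capacity if count_full_allocation else b.used for b in buffers)
    return charged + incoming > limit


# Three half-full 4 KiB buffers held by an illustrative sort.
held = [Buffer(capacity=4096, used=2048) for _ in range(3)]
limit = 14 * 1024   # illustrative operator budget, not a Drill default
incoming = 4096     # size of the next batch to buffer

spill_by_used = should_spill(held, limit, incoming, count_full_allocation=False)
spill_by_allocation = should_spill(held, limit, incoming, count_full_allocation=True)
```

Under these made-up numbers the used-bytes view still has room for the next batch, while the full-allocation view is already over budget, which is the kind of earlier termination condition the thread describes.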
