Are you running the tests on 32-core machines? A different number of cores
affects how much memory is available for the sort.
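
To make that concrete, here is a rough, illustrative sketch of the heuristic:
the per-node sort quota (planner.memory.max_query_memory_per_node, 2 GB by
default) is divided across all sort instances, and the number of instances
grows with planner.width.max_per_node, which defaults to roughly 70% of the
cores. Only the option names and defaults come from Drill; the class itself
is made up for illustration.

// Illustrative sketch only: why more cores can mean less memory per sort.
public class SortBudgetEstimate {

  static long perSortMemory(long maxQueryMemoryPerNode, int sortsInPlan, int cores) {
    // planner.width.max_per_node defaults to ~70% of the node's cores.
    int widthPerNode = Math.max(1, (int) (cores * 0.70));
    return maxQueryMemoryPerNode / ((long) sortsInPlan * widthPerNode);
  }

  public static void main(String[] args) {
    long quota = 2147483648L; // 2 GB default quota
    System.out.println(perSortMemory(quota, 1, 8));  // 429496729 (~429 MB)
    System.out.println(perSortMemory(quota, 1, 32)); // 97612893  (~98 MB)
  }
}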

On Wed, Dec 30, 2015 at 1:02 PM, Abdel Hakim Deneche <[email protected]>
wrote:

> The following tests are failing:
>
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q163_DRILL-2046.q
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q177_DRILL-2046.q
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q174.q
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/
>> /Functional/window_functions/multiple_partitions/q35.sql
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q160_DRILL-1985.q
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q162_DRILL-1985.q
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q165.q
>> /Functional/window_functions/multiple_partitions/q37.sql
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q171.q
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q168_DRILL-2046.q
>> /Functional/window_functions/multiple_partitions/q36.sql
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/q159_DRILL-2046.q
>> /Functional/window_functions/multiple_partitions/q30.sql
>> /Functional/data-shapes/wide-columns/5000/1000rows/parquet/large/q157_DRILL-1985.q
>> /Functional/window_functions/multiple_partitions/q22.sql
>
>
> With one of the following errors:
>
>> java.sql.SQLException: RESOURCE ERROR: One or more nodes ran out of memory while executing the query.
>> Caused by: org.apache.drill.exec.exception.OutOfMemoryException: org.apache.drill.exec.exception.OutOfMemoryException: Unable to allocate sv2, and not enough batchGroups to spill
>>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:356) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>
>
> or
>
>> java.sql.SQLException: SYSTEM ERROR: DrillRuntimeException: Failed to pre-allocate memory for SV. Existing recordCount*4 = 0, incoming batch recordCount*4 = 3340
>> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed to pre-allocate memory for SV. Existing recordCount*4 = 0, incoming batch recordCount*4 = 3340
>>         at org.apache.drill.exec.physical.impl.sort.SortRecordBatchBuilder.add(SortRecordBatchBuilder.java:116) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>>         at org.apache.drill.exec.physical.impl.xsort.ExternalSortBatch.innerNext(ExternalSortBatch.java:451) ~[drill-java-exec-1.5.0-SNAPSHOT.jar:1.5.0-SNAPSHOT]
>
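>
> For context on the numbers in those messages: an SV2 is a two-byte
> selection vector (2 bytes per record) and an SV4 uses 4 bytes per record,
> so "incoming batch recordCount*4 = 3340" corresponds to an 835-record
> batch. A minimal sketch of the arithmetic (the class below is purely
> illustrative, not Drill's API):
>
> // Illustrative only: selection vectors index records within a batch (SV2,
> // 2 bytes per entry) or across batches (SV4, 4 bytes per entry).
> public class SelectionVectorSizing {
>
>   static long sv2Bytes(int recordCount) { return 2L * recordCount; }
>
>   static long sv4Bytes(int recordCount) { return 4L * recordCount; }
>
>   public static void main(String[] args) {
>     // The second error could not pre-allocate even this ~3.3 KB SV4
>     // buffer, a sign the operator's allocator was already at its limit.
>     System.out.println(sv4Bytes(835)); // 3340
>   }
> }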
>
>
>
> On Wed, Dec 30, 2015 at 12:42 PM, Jacques Nadeau <[email protected]>
> wrote:
>
>> I'll let Steven answer your question directly.
>>
>> FYI, we are running a regression suite that was forked from the MapR repo
>> a month or so ago because we had to fix a bunch of things to make it work
>> with Apache Hadoop. (There was a thread about this back then and we
>> haven't yet figured out how to merge both suites.) It is possible that he
>> had a successful run but the failures are happening on items that you've
>> recently added to your suite.
>>
>> It is also possible (likely?) that the configuration settings for our
>> regression clusters are not the same.
>>
>> --
>> Jacques Nadeau
>> CTO and Co-Founder, Dremio
>>
>> On Wed, Dec 30, 2015 at 12:37 PM, Abdel Hakim Deneche <[email protected]> wrote:
>>
>> > Steven,
>> >
>> > Were you able to successfully run the regression tests on the transfer
>> > patch? I just tried and saw several queries running out of memory!
>> >
>> > On Wed, Dec 30, 2015 at 11:46 AM, Abdel Hakim Deneche <[email protected]> wrote:
>> >
>> > > Created DRILL-4236 <https://issues.apache.org/jira/browse/DRILL-4236>
>> > > to keep track of this improvement.
>> > >
>> > > On Wed, Dec 30, 2015 at 11:01 AM, Jacques Nadeau <[email protected]>
>> > > wrote:
>> > >
>> > >> Since the accounting changed (it is now more accurate), the
>> > >> termination condition for the sort operator will be different than
>> > >> before. In fact, it will likely trigger sooner, since the amount we
>> > >> account for is much larger than previously (we now correctly consider
>> > >> the entire allocation rather than simply the used portion).
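>> > >>
>> > >> As a toy example of the difference (my numbers are illustrative; the
>> > >> allocator rounds buffer sizes up to a power of two): a vector holding
>> > >> 5000 bytes of data sits in an 8192-byte buffer, so the new accounting
>> > >> charges 8192 bytes where the old accounting charged only ~5000.
>> > >>
>> > >> // Toy sketch of "entire allocation vs. used allocation".
>> > >> public class AccountingExample {
>> > >>
>> > >>   // Round up to the next power of two, as the buffer pool does.
>> > >>   static long allocatedSize(long usedBytes) {
>> > >>     long size = Long.highestOneBit(usedBytes);
>> > >>     return size == usedBytes ? size : size << 1;
>> > >>   }
>> > >>
>> > >>   public static void main(String[] args) {
>> > >>     System.out.println(allocatedSize(5000)); // 8192, not 5000
>> > >>   }
>> > >> }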
>> > >>
>> > >> Hakim,
>> > >> Steven and I were discussing the need to update the ExternalSort
>> > >> operator to use the new allocator functionality to better manage its
>> > >> memory envelope. Would you be interested in working on this, since
>> > >> you seem to be working with that code the most? Basically, it used to
>> > >> be that there was no way the sort operator could correctly detect a
>> > >> memory condition, so it jumped through a bunch of hoops to try to
>> > >> figure out the termination condition. With the transfer accounting in
>> > >> place, this code can be greatly simplified to just use the current
>> > >> operator memory allocation, as sketched below.
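>> > >>
>> > >> Something like the following, as a sketch (getAllocatedMemory() and
>> > >> getLimit() are the real BufferAllocator accessors; the surrounding
>> > >> method and the batch-size estimate are hypothetical):
>> > >>
>> > >> import org.apache.drill.exec.memory.BufferAllocator;
>> > >>
>> > >> class SpillDecision {
>> > >>   // Spill when admitting the next batch would push the operator past
>> > >>   // its limit; with transfer accounting the totals are trustworthy.
>> > >>   static boolean shouldSpill(BufferAllocator allocator, long estimatedBatchBytes) {
>> > >>     return allocator.getAllocatedMemory() + estimatedBatchBytes
>> > >>         > allocator.getLimit();
>> > >>   }
>> > >> }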
>> > >>
>> > >> --
>> > >> Jacques Nadeau
>> > >> CTO and Co-Founder, Dremio
>> > >>
>> > >> On Wed, Dec 30, 2015 at 10:48 AM, rahul challapalli <[email protected]> wrote:
>> > >>
>> > >> > I installed the latest master and ran this query, so
>> > >> > planner.memory.max_query_memory_per_node should have been the
>> > >> > default value. I switched back to the 1.4.0 branch and the query
>> > >> > completed successfully.
>> > >> >
>> > >> > On Wed, Dec 30, 2015 at 10:37 AM, Abdel Hakim Deneche <[email protected]> wrote:
>> > >> >
>> > >> > > Rahul,
>> > >> > >
>> > >> > > How much memory was assigned to the sort operator
>> > >> > > (planner.memory.max_query_memory_per_node)?
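>> > >> > >
>> > >> > > If you want to check or raise it before re-running, something
>> > >> > > like this over JDBC works (the URL assumes a Drillbit reachable
>> > >> > > through local ZooKeeper, and the 4 GB value is only an example):
>> > >> > >
>> > >> > > import java.sql.Connection;
>> > >> > > import java.sql.DriverManager;
>> > >> > > import java.sql.ResultSet;
>> > >> > > import java.sql.Statement;
>> > >> > >
>> > >> > > public class SortMemoryCheck {
>> > >> > >   public static void main(String[] args) throws Exception {
>> > >> > >     try (Connection conn = DriverManager.getConnection("jdbc:drill:zk=local");
>> > >> > >          Statement stmt = conn.createStatement()) {
>> > >> > >       // Show the current value; 2147483648 (2 GB) is the default.
>> > >> > >       ResultSet rs = stmt.executeQuery(
>> > >> > >           "SELECT name, num_val FROM sys.options "
>> > >> > >           + "WHERE name = 'planner.memory.max_query_memory_per_node'");
>> > >> > >       while (rs.next()) {
>> > >> > >         System.out.println(rs.getString("name") + " = " + rs.getLong("num_val"));
>> > >> > >       }
>> > >> > >       // Raise the per-node sort quota for this session only.
>> > >> > >       stmt.execute("ALTER SESSION SET "
>> > >> > >           + "`planner.memory.max_query_memory_per_node` = 4294967296");
>> > >> > >     }
>> > >> > >   }
>> > >> > > }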
>> > >> > >
>> > >> > > On Wed, Dec 30, 2015 at 9:54 AM, rahul challapalli <[email protected]> wrote:
>> > >> > >
>> > >> > > > I am seeing an OOM error while executing a simple CTAS query.
>> > >> > > > I raised DRILL-4324 for this. The query mentioned in the JIRA
>> > >> > > > used to complete successfully without any issue prior to 1.5.
>> > >> > > > Any idea what could have caused the regression?
>> > >> > > >
>> > >> > > > - Rahul
>> > >> > > >
>>



-- 

Abdelhakim Deneche

Software Engineer

