Hi Francis

I did some digging and learned something interesting. I spoke with an engineer 
who has worked on the Drill C++ client (the underlying client code that the 
ODBC driver wraps). I’ll try to explain the cause of the problem as best as I 
can:

From what I understand, your query is trying to extract complex data (one or 
more columns have the map datatype).

You are reading this complex Parquet data out of S3 in three ways:

  1.  WebUI --> Drill runs out of memory (with the old memory settings; not 
sure if you retried this!)
  2.  CTAS (CREATE TABLE AS SELECT) --> Runs to completion
  3.  Tableau --> Drill runs out of memory

Let me explain how the three differ.

  1.  WebUI essentially creates a ResultSet listener within the Drillbit 
itself. This was introduced in 1.11.0 to allow for efficient thread management, 
because until then, we’d create an embedded DrillClient to act as a proxy for 
the web browser.
However, we still need pagination so that, as the browser consumes the data, 
Drill can reuse the memory to pull in more data. In the absence of this, the 
resultset is held completely in memory. Pagination is a low priority because we 
don’t expect users to use the WebUI for large sets of data (though REST APIs 
are a good candidate for supporting the need).
  2.  CTAS basically reads from the source and writes it to a temporary space 
(tmp). This goes cleanly because it runs in parallel, and no ODBC/JDBC is 
involved.
  3.  Tableau uses the ODBC driver to query data… and (this I didn’t know) ODBC 
does not understand Object (map) datatypes. It’s a limitation of ODBC, not 
Tableau. So, to allow tools like Tableau to consume such data, Drill converts 
the maps into JSON documents.
Now, Drill (currently) moves data in fixed batches of 64K records. The result 
is that the ComplexToJson (00-01) operator (see the OperatorOverview and 
PhysicalPlan) is doing a large number of string conversions. And by the nature 
of the ODBC client, this happens at the root major fragment, which is a single 
thread doing the conversion. That is why the memory usage climbs so high. We 
are hoping to have an adaptable batch-size mechanism in a future Drill release, 
but I am not sure it will necessarily resolve your issue, partly because I 
don’t know how large the generated JSON documents are, and the code creating 
them might be memory hungry.
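To give a feel for the scale involved, here is a back-of-envelope sketch of the memory held by that single-threaded conversion. The 64K batch size is from the explanation above; the average converted-document size is purely an assumed figure for illustration:

```python
# Rough memory estimate for one in-flight batch in the single-threaded
# ComplexToJson operator.
records_per_batch = 64 * 1024        # Drill's fixed batch size (64K records)
avg_json_doc_bytes = 4 * 1024        # ASSUMPTION: 4 KiB per converted map doc
batch_bytes = records_per_batch * avg_json_doc_bytes
print(f"{batch_bytes / 2**20:.0f} MiB held per in-flight batch")  # 256 MiB
```

With wide maps the per-document size can easily be much larger than 4 KiB, which is why the root fragment's memory climbs so quickly.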

So, the options you have are limited at the moment.

  1.  Partition the data and query the partitions separately via Tableau, or 
run the query using LIMIT
  2.  Use a JDBC-based BI tool, since JDBC is able to manage complex data.

Option 1 defeats the purpose of using Drill across a large dataset, but I hope 
you understand that the issue has more to do with ODBC’s inability to handle 
complex data.
Option 2 should work best. I’m assuming Tableau offers JDBC source options.
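As a sketch of option 1 (the view name and partition value here are hypothetical; adjust to your schema):

```sql
-- Query one partition at a time instead of the whole dataset
SELECT * FROM dfs.views.`my_view` WHERE dir0 = '2018-01';

-- Or cap the result size while testing
SELECT * FROM dfs.views.`my_view` LIMIT 100000;
```

`dir0` is Drill's implicit pseudo-column for the first directory level under the queried path, so this only works if the data is laid out in partition directories.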

Hope this helps.

~ Kunal

From: Francis McGregor-Macdonald [mailto:fran...@mc-mac.com]
Sent: Monday, January 29, 2018 4:00 PM
To: user@drill.apache.org
Subject: Fwd: Creating a Tableau extracts with Drill 1.12 uses unlimited memory

Thanks Andries and Kunal,

Query profile attached along with log and out from master node. I included the 
same select into a create table (csv) which works as I would expect (I have 
attached as a .zip as messages keep failing due to size, hopefully that works).

In the S3 bucket how many parquet files are present that you are trying to 
query with Drill?
>> A few thousand in the bucket, a single file under the query (36 fragments)

The View with Select * from xxxxx is typically a bad idea when dealing with BI 
>> The * is generated by Tableau; the underlying views are explicit. My bad for 
>> suggesting that earlier; I meant select all columns, and they are listed in 
>> the select statement in the view.

For parquet data you may consider updating the metadata cache
>> Noted, I did that and it didn't seem to change the result. I also rebuilt 
>> the file (it is created using Drill) a few times in case of corruption.

(I am having trouble getting messages through due to size)


On Tue, Jan 30, 2018 at 4:29 AM, Andries Engelbrecht 
<aengelbre...@mapr.com<mailto:aengelbre...@mapr.com>> wrote:
Also a couple of other things to consider (may not directly fix your current 
issue, but good practices).

1) In the S3 bucket how many parquet files are present that you are trying to 
query with Drill? (Others have asked for the Drill plans as well; it will also 
be good to see how many minor fragments the Drill cluster is using to read the 
S3 data.)
2) The View with Select * from xxxxx is typically a bad idea when dealing with 
BI tools. Drill is based on Schema discovery and by doing this you are forcing 
Drill to discover the Schema every time a tool requests metadata, and you also 
make query planning much harder and more expensive.
3) For parquet data you may consider updating the metadata cache; it helps 
improve query planning speed, metadata operations, etc. It is also a good check 
to see if the data is healthy.
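For reference, the metadata cache is refreshed with a single command (the bucket and path here are placeholders):

```sql
REFRESH TABLE METADATA s3.`/my-bucket/path/to/parquet`;
```

This writes a metadata cache file alongside the Parquet data, so the planner can skip reading every file footer at plan time.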

You may consider looking at some pointers here.


On 1/28/18, 6:18 PM, "Kunal Khatua" <kkha...@mapr.com<mailto:kkha...@mapr.com>> wrote:

    Hi Francis

    Looks like the logs didn’t get attached.

    As for the physical plan, ignore the “memory = 0.0”, because it is a 
physical plan and not the actual executed query’s profile.

    To debug where the memory is being consumed, look at the query’s profile 
page.

    This will tell you which is the longest running major fragment:


    You’re looking for the Max Peak Memory column to see which ones are holding 
the most memory.


    And this will tell you the statistics for the various operators’ 
consumption of memory.

    Running such a query through the WebUI is generally not recommended, 
because (by design) the WebUI creates a ResultSetListener in the Drillbit 
where it holds all the records before sending anything back.

    However, for Tableau, the DrillClient should have been consuming the 
records, so I’m wondering why the Drillbit would run out of memory.

    Could you share the following for both scenarios (the Web UI and when 
running via Tableau)?

      1.  The profiles

      2.  The Drillbit logs (drillbit.out, drillbit.log, etc)



    From: Francis McGregor-Macdonald 

    Sent: Sunday, January 28, 2018 2:49 PM

    To: user@drill.apache.org<mailto:user@drill.apache.org>

    Subject: Re: Fwd: Creating a Tableau extracts with Drill 1.12 uses 
unlimited memory

    Hi all,

    A physical plan attached ... all memory appears to be 0.0 which seems odd?


    On Sun, Jan 28, 2018 at 10:37 PM, Francis McGregor-Macdonald 

    And with logs as attachments.

    On Sun, Jan 28, 2018 at 9:40 PM, Francis McGregor-Macdonald 

    Thanks Paul and Kunal,

    I think I have the right information now. With Paul's changes (and fixing 
up a zoo.cfg error) it isn't crashing, rather failing. Logs attached; it is 
still blowing past memory limits. It does the same thing when re-running the 
query from the web console, so presumably it's not actually Tableau related 
despite me first generating it that way.


    On Sat, Jan 27, 2018 at 1:15 PM, Francis McGregor-Macdonald 

    Thanks Paul,

    I will update with your suggested memory allocations also and retry.

    Zookeeper crashed too, which might explain more? I have attached the logs 
from Zookeeper as well.


    On Sat, Jan 27, 2018 at 6:45 AM, Paul Rogers 

    Hi Francis,

    Thanks much for the log. The log shows running a query, then immediately 
shows entries that occur when starting Drill. I'm guessing that Drill literally 
crashed at this point? This is more severe than the usual error in which a 
query exhausts memory.

    Some general observations: the Drill memory is 60 GB, but system memory is 
61 GB. Perhaps try dropping total Drill memory to give the OS and other tasks 
more headroom. For a SELECT * query, Drill needs far less memory than what you 
have, so maybe try giving Drill 48 GB total.

    Then, Drill needs direct memory much more than heap. So, maybe give Drill 
39 GB direct, 8 GB heap and 1 GB (the default) for code cache. These settings 
are in drill-env.sh.
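    In drill-env.sh, that suggested split would look something like the 
fragment below. The variable names are the standard drill-env.sh ones; verify 
them against your Drill 1.12 install before applying:

```shell
# drill-env.sh -- one possible split of 48 GB total Drill memory
export DRILL_HEAP="8G"                  # JVM heap
export DRILL_MAX_DIRECT_MEMORY="39G"    # direct (off-heap) memory
export DRILLBIT_CODE_CACHE_SIZE="1G"    # JVM code cache (the default)
```

    Restart the Drillbits after changing these for the settings to take effect.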

    Kunal, you have more experience with these issues. Can you make additional 
suggestions by looking at the log?


    - Paul

    On Thursday, January 25, 2018, 10:20:29 PM PST, Francis McGregor-Macdonald 

    Hi all,

    I am guessing that each of your EMR nodes are quite large? The EMR nodes 
are r4.2xlarge (vcpu: 8, memory: 61 GB).

    Property "planner.width.max_per_node" is set to = 6

    What is the system memory and what are the allocations for heap and direct?

    System Memory: 61GB (EMR nodes above)

    drill_mem_heap: 12G

    drill_mem_max: 48G

    The view is simple: SELECT * FROM s3://myparquet.parquet (14GB)

    planner.memory.max_query_memory_per_node = 10479720202

    Drillbit.log attached (I think I have the correct selection included).


    On Fri, Jan 26, 2018 at 2:41 PM, Kunal Khatua 

    What is the system memory and what are the allocations for heap and direct? 
The memory crash might be occurring due to insufficient heap. The limits 
parameter applies to the direct memory and not Heap.

    Can you share details in the logs from the crash?

    -----Original Message-----

    From: Timothy Farkas 

    Sent: Thursday, January 25, 2018 2:58 PM


    Subject: Re: Creating a Tableau extracts with Drill 1.12 uses unlimited 

    Hi Francis,

    I am guessing that each of your EMR nodes are quite large (32 or 64 vcpus). 
On large machines Drill's planner over-parallelizes and over-allocates memory. 
There is a property "planner.width.max_per_node" which limits the number of 
operators that can simultaneously execute on a Drillbit for a query. If you 
configure the width per node to something like 5 or 10 (you may have to play 
around with it) things should start working.
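    For reference, that property can be set from any Drill SQL client, either 
cluster-wide or per session:

```sql
-- Cluster-wide setting
ALTER SYSTEM SET `planner.width.max_per_node` = 6;

-- Or only for the current session
ALTER SESSION SET `planner.width.max_per_node` = 6;
```

    The value 6 here is just an example in the suggested 5-10 range; tune it for your node size.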



    ________________________________

    From: Francis McGregor-Macdonald 

    Sent: Thursday, January 25, 2018 1:58:22 PM


    Subject: Creating a Tableau extracts with Drill 1.12 uses unlimited memory

    When creating a Tableau (10.3, 10.5 desktop) extract from a Drill (1.12 on 
EMR) cluster, memory appears not to adhere to the limits set by 
planner.memory.max_query_memory_per_node.

    The extract query consumes all memory and then crashes Drill.

    Running the same query as a CREATE TABLE, memory behaves as expected.

    The query complexity is trivial:

    select * from a view over only a single parquet file, with no calculated 
fields.

    Has anyone else observed this behavior?
