Re: Hangout Discussion Topics for 04-16-2019

2019-04-24 Thread Padma Penumarthy
You can look at RecordBatchMemoryManager.java and follow the code of one of the operators (like flatten) to see how this was done. Thanks Padma On Wed, Apr 24, 2019 at 12:00 PM Paul Rogers wrote: > Hi Igor, > > Thanks for the recap. You asked about vector allocation. Here is where I > think things

Re: [ANNOUNCE] New Committer: Karthikeyan Manivannan

2018-12-07 Thread Padma Penumarthy
Congrats Karthik. Thanks Padma On Fri, Dec 7, 2018 at 1:33 PM Paul Rogers wrote: > Congrats Karthik! > > - Paul > > Sent from my iPhone > > > On Dec 7, 2018, at 11:12 AM, Abhishek Girish wrote: > > > > Congratulations Karthik! > > > >> On Fri, Dec 7, 2018 at 11:11 AM Arina Ielchiieva >

Re: November Apache Drill board report

2018-11-07 Thread Padma Penumarthy
r the report. Could more folks please > review the report? > > Kind regards, > Arina > > On Fri, Nov 2, 2018 at 8:33 PM Padma Penumarthy < > penumarthy.pa...@gmail.com> > wrote: > > > Hi Arina, > > > > Can you add batch sizing (for bunch of operator

Re: November Apache Drill board report

2018-11-02 Thread Padma Penumarthy
Hi Arina, Can you also add batch sizing (for a bunch of operators and the parquet reader)? Thanks Padma On Fri, Nov 2, 2018 at 2:55 AM Arina Ielchiieva wrote: > Sure, let's mention. > Updated the report. > > = > > ## Description: > - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL

Re: [ANNOUNCE] New Committer: Hanumath Rao Maduri

2018-11-01 Thread Padma Penumarthy
Congratulations Hanu. Thanks Padma On Thu, Nov 1, 2018 at 7:44 PM weijie tong wrote: > Congratulations, Hanu! > > On Fri, Nov 2, 2018 at 8:22 AM Robert Hou wrote: > > > Congratulations, Hanu. Thanks for contributing to Drill. > > > > --Robert > > > > On Thu, Nov 1, 2018 at 4:06 PM Jyothsna

Re: msgpack reading schema files checksum error

2018-10-30 Thread Padma Penumarthy
How are you modifying the file manually? Are you copying it to the local file system, making changes, and copying it back to HDFS? Thanks Padma On Tue, Oct 30, 2018 at 5:45 PM Jean-Claude Cote wrote: > I'm writing a msgpack reader which supports schema validation. The msgpack > reader is able to

Re: [ANNOUNCE] New Committer: Gautam Parai

2018-10-22 Thread Padma Penumarthy
Congratulations Gautam. Thanks Padma On Mon, Oct 22, 2018 at 7:25 AM Arina Ielchiieva wrote: > The Project Management Committee (PMC) for Apache Drill has invited Gautam > Parai to become a committer, and we are pleased to announce that he has > accepted. > > Gautam has become a contributor

Re: [ANNOUNCE] New Committer: Chunhui Shi

2018-09-28 Thread Padma Penumarthy
Congratulations Chunhui. Thanks Padma On Fri, Sep 28, 2018 at 2:17 AM Arina Ielchiieva wrote: > The Project Management Committee (PMC) for Apache Drill has invited Chunhui > Shi to become a committer, and we are pleased to announce that he has > accepted. > > Chunhui Shi has become a

Re: [ANNOUNCE] New PMC member: Boaz Ben-Zvi

2018-08-17 Thread Padma Penumarthy
Congratulations Boaz. Thanks Padma On Fri, Aug 17, 2018 at 2:33 PM, Robert Wu wrote: > Congratulations, Boaz! > > Best regards, > > Rob > > -Original Message- > From: Abhishek Girish > Sent: Friday, August 17, 2018 2:17 PM > To: dev > Subject: Re: [ANNOUNCE] New PMC member: Boaz

Re: Metadata management improvement

2018-08-03 Thread Padma Penumarthy
hat the default > should be. > > Concerning the overall change: the introduction of TTL, can we submit a > design document, or would you prefer to invest on the longer term meta data > repository? > > Regards, Joel > > > On Thu, Aug 2, 2018 at 6:28 AM, Padma Penumarthy

Re: Metadata management improvement

2018-08-01 Thread Padma Penumarthy
of TTL, I think having a system/session option that will let us > skip this check altogether would be a good thing to have. So, if we know we > are not adding new data, we can set that option." > I would see the need to set TTL per Table. Since different tables will have > different upda

Re: [ANNOUNCE] New PMC Chair of Apache Drill

2018-07-18 Thread Padma Penumarthy
Arina, Congratulations and best wishes. Thanks Padma On Wed, Jul 18, 2018 at 4:54 PM, Bridget Bevens wrote: > Congratulations, Arina!!! > > On Wed, Jul 18, 2018 at 3:20 PM, Parth Chandra wrote: > > > Congratulations > > > > On Wed, Jul 18, 2018 at 3:14 PM, Kunal Khatua wrote: > > > > >

Re: Metadata management improvement

2018-07-13 Thread Padma Penumarthy
Hi Joel, This is my understanding: We have a list of all directories (i.e. all subdirectories and their subdirectories, etc.) in the metadata cache file of each directory. We go through that list of directories and check each directory's modification time against the modification time of the metadata cache file in
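The check described in this message can be sketched roughly as follows. This is a minimal illustration, not Drill's actual metadata code: the class name, method name, and the use of a plain map of directory modification times are all hypothetical, and because the snippet is cut off, the rule "refresh if any directory is newer than the cache file" is an assumption.

```java
import java.util.Map;

/** Hypothetical sketch; names and data layout are not taken from Drill. */
class MetadataCacheCheckSketch {

    /**
     * Returns true if any directory listed in the cache file was modified
     * after the cache file itself was written (assumed staleness rule).
     */
    static boolean needsRefresh(long cacheFileModTime, Map<String, Long> dirModTimes) {
        for (long dirModTime : dirModTimes.values()) {
            if (dirModTime > cacheFileModTime) {
                return true; // a directory changed after the cache was built
            }
        }
        return false; // nothing is newer than the cache file
    }
}
```

Under this assumption, one changed subdirectory anywhere in the list is enough to mark the cache as stale.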

Re: [DISCUSS] 1.14.0 release

2018-07-06 Thread Padma Penumarthy
If possible, please include PR 1363 in this release so we can complete our batch sizing work (except for exchange operators) DRILL-6549: batch sizing for nested loop join. Thanks Padma On Fri, Jul 6, 2018 at 2:51 PM, Pritesh Maker wrote: > Here is the release 1.14 dashboard

[jira] [Created] (DRILL-6549) batch sizing for nested loop join

2018-06-27 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6549: --- Summary: batch sizing for nested loop join Key: DRILL-6549 URL: https://issues.apache.org/jira/browse/DRILL-6549 Project: Apache Drill Issue Type

Re: Drill Hangout tomorrow 06/26

2018-06-26 Thread Padma Penumarthy
Here is the link to the document. Any feedback/comments welcome. https://docs.google.com/document/d/1Z-67Y_KNcbA2YYWCHEwf2PUEmXRPWSXsw-CHnXW_98Q/edit?usp=sharing Thanks Padma On Jun 26, 2018, at 12:12 PM, Aman Sinha <amansi...@gmail.com> wrote: Hangout attendees on 06/26: Padma,

Re: [ANNOUNCE] New PMC member: Vitalii Diravka

2018-06-26 Thread Padma Penumarthy
Congrats Vitalii. Thanks Padma > On Jun 26, 2018, at 6:14 PM, Vlad Rozov wrote: > > Congratulations Vitalii! > > Thank you, > > Vlad > > On 6/26/18 17:11, Paul Rogers wrote: >> Congratulations Vitalii! >> - Paul >> >> >> On Tuesday, June 26, 2018, 11:12:16 AM PDT, Aman Sinha >>

[jira] [Created] (DRILL-6539) Record count not set for this vector container error

2018-06-25 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6539: --- Summary: Record count not set for this vector container error Key: DRILL-6539 URL: https://issues.apache.org/jira/browse/DRILL-6539 Project: Apache Drill

[jira] [Created] (DRILL-6537) Limit the batch size for buffering operators based on how much memory they get

2018-06-25 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6537: --- Summary: Limit the batch size for buffering operators based on how much memory they get Key: DRILL-6537 URL: https://issues.apache.org/jira/browse/DRILL-6537

[jira] [Resolved] (DRILL-6427) outputBatchSize is missing from the DEBUG output for HashJoinBatch operator

2018-06-18 Thread Padma Penumarthy (JIRA)
[ https://issues.apache.org/jira/browse/DRILL-6427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Penumarthy resolved DRILL-6427. - Resolution: Fixed > outputBatchSize is missing from the DEBUG output for HashJoinBa

[jira] [Created] (DRILL-6512) Remove unnecessary processing overhead from RecordBatchSizer

2018-06-17 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6512: --- Summary: Remove unnecessary processing overhead from RecordBatchSizer Key: DRILL-6512 URL: https://issues.apache.org/jira/browse/DRILL-6512 Project: Apache

Re: [ANNOUNCE] New Committer: Padma Penumarthy

2018-06-15 Thread Padma Penumarthy
Congratulations, Padma ! >>> >>> >>> On 6/15/2018 12:34:15 PM, Robert Wu wrote: >>> Congratulations, Padma! >>> >>> Best regards, >>> >>> Rob >>> >>> -----Original Message- >>> From:

[jira] [Created] (DRILL-6499) No need to calculate stdRowWidth for every batch by default

2018-06-15 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6499: --- Summary: No need to calculate stdRowWidth for every batch by default Key: DRILL-6499 URL: https://issues.apache.org/jira/browse/DRILL-6499 Project: Apache

Re: [DISCUSS] 1.14.0 release

2018-06-13 Thread Padma Penumarthy
I am planning to open a couple of batch sizing PRs this week that I would like to get in. Thanks Padma > On Jun 13, 2018, at 11:59 AM, Vlad Rozov wrote: > > DRILL-6422: Update guava to 23.0 and shade it (PR in review) > DRILL-6353: Upgrade Parquet MR dependencies (ready-to-commit) > > Thank

[jira] [Created] (DRILL-6478) enhance debug logs for batch sizing

2018-06-07 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6478: --- Summary: enhance debug logs for batch sizing Key: DRILL-6478 URL: https://issues.apache.org/jira/browse/DRILL-6478 Project: Apache Drill Issue Type

[jira] [Resolved] (DRILL-6274) MergeJoin Memory Manager is still using Fragmentation Factor

2018-06-01 Thread Padma Penumarthy (JIRA)
[ https://issues.apache.org/jira/browse/DRILL-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Penumarthy resolved DRILL-6274. - Resolution: Fixed > MergeJoin Memory Manager is still using Fragmentation Fac

[jira] [Resolved] (DRILL-6298) Add debug log for merge join batch sizing

2018-06-01 Thread Padma Penumarthy (JIRA)
[ https://issues.apache.org/jira/browse/DRILL-6298?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Penumarthy resolved DRILL-6298. - Resolution: Fixed > Add debug log for merge join batch siz

Re: [ANNOUNCE] New Committer: Timothy Farkas

2018-05-25 Thread Padma Penumarthy
Congrats Tim. Thanks Padma > On May 25, 2018, at 12:15 PM, Vitalii Diravka > wrote: > > Good news! Congratulations, Timothy! > > Kind regards > Vitalii > > > On Fri, May 25, 2018 at 10:04 PM Arina Yelchiyeva < > arina.yelchiy...@gmail.com> wrote: > >> Congrats,

[jira] [Resolved] (DRILL-6411) Make batch memory sizing logs uniform across all operators

2018-05-17 Thread Padma Penumarthy (JIRA)
[ https://issues.apache.org/jira/browse/DRILL-6411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Penumarthy resolved DRILL-6411. - Resolution: Fixed > Make batch memory sizing logs uniform across all operat

[jira] [Created] (DRILL-6411) Make batch memory sizing logs uniform across all operators

2018-05-11 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6411: --- Summary: Make batch memory sizing logs uniform across all operators Key: DRILL-6411 URL: https://issues.apache.org/jira/browse/DRILL-6411 Project: Apache Drill

[jira] [Created] (DRILL-6402) Repeated Value Vectors copyFrom methods are not updating the value count and writer index correctly for values vector

2018-05-10 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6402: --- Summary: Repeated Value Vectors copyFrom methods are not updating the value count and writer index correctly for values vector Key: DRILL-6402 URL: https

Re: [ANNOUNCE] New Committer: Sorabh Hamirwasia

2018-04-30 Thread Padma Penumarthy
Congrats Sorabh. Thanks Padma > On Apr 30, 2018, at 11:53 AM, Gautam Parai wrote: > > Congratulations Sorabh! Well deserved. > > > Gautam > > > From: rahul challapalli > Sent: Monday, April 30, 2018 11:09:24 AM

[jira] [Created] (DRILL-6356) batch sizing for union all

2018-04-25 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6356: --- Summary: batch sizing for union all Key: DRILL-6356 URL: https://issues.apache.org/jira/browse/DRILL-6356 Project: Apache Drill Issue Type

[jira] [Created] (DRILL-6343) bit vector copyFromSafe is not doing realloc

2018-04-19 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6343: --- Summary: bit vector copyFromSafe is not doing realloc Key: DRILL-6343 URL: https://issues.apache.org/jira/browse/DRILL-6343 Project: Apache Drill

Re: [DISCUSS] Regarding mutator interface

2018-04-11 Thread Padma Penumarthy
ated") are also containers. Non-repeated lists are containers for > unions, repeated-lists are containers for arrays. > Any setting should be done on the contained vectors. For lists, only the > offset vector is updated. > So, another question is: what is the generated code tr

Re: [DISCUSS] Regarding mutator interface

2018-04-11 Thread Padma Penumarthy
Can you explain how aggregation on complex types works (or is supposed to work)? Thanks Padma > On Apr 11, 2018, at 12:15 PM, Gautam Parai wrote: > > Hi all, > > > I am implementing a new aggregate function which also handles Complex types > (map and list). However, the

[jira] [Created] (DRILL-6310) limit batch size for hash aggregate

2018-04-05 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6310: --- Summary: limit batch size for hash aggregate Key: DRILL-6310 URL: https://issues.apache.org/jira/browse/DRILL-6310 Project: Apache Drill Issue Type

[jira] [Created] (DRILL-6307) Handle empty batches in record batch sizer correctly

2018-04-03 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6307: --- Summary: Handle empty batches in record batch sizer correctly Key: DRILL-6307 URL: https://issues.apache.org/jira/browse/DRILL-6307 Project: Apache Drill

[jira] [Created] (DRILL-6298) Add debug log for merge join batch sizing

2018-03-28 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6298: --- Summary: Add debug log for merge join batch sizing Key: DRILL-6298 URL: https://issues.apache.org/jira/browse/DRILL-6298 Project: Apache Drill Issue

[jira] [Created] (DRILL-6296) Add operator metrics for batch sizing for merge join

2018-03-28 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6296: --- Summary: Add operator metrics for batch sizing for merge join Key: DRILL-6296 URL: https://issues.apache.org/jira/browse/DRILL-6296 Project: Apache Drill

[jira] [Created] (DRILL-6284) Add operator metrics for batch sizing for flatten

2018-03-21 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6284: --- Summary: Add operator metrics for batch sizing for flatten Key: DRILL-6284 URL: https://issues.apache.org/jira/browse/DRILL-6284 Project: Apache Drill

batch sizing for operators

2018-03-13 Thread Padma Penumarthy
I have written a small design document explaining the batch sizing work we are doing for operators (other than scan). https://issues.apache.org/jira/browse/DRILL-6238 Some of this work has already been done in 1.13. flatten, merge join and external sort are changed to adhere to batch size

[jira] [Created] (DRILL-6238) Batch sizing for operators

2018-03-13 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6238: --- Summary: Batch sizing for operators Key: DRILL-6238 URL: https://issues.apache.org/jira/browse/DRILL-6238 Project: Apache Drill Issue Type: New

[jira] [Created] (DRILL-6236) batch sizing for hash join and hash aggregate

2018-03-12 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6236: --- Summary: batch sizing for hash join and hash aggregate Key: DRILL-6236 URL: https://issues.apache.org/jira/browse/DRILL-6236 Project: Apache Drill

[jira] [Created] (DRILL-6232) Vector initializer used for memory allocation in external sort is subject to aliasing

2018-03-11 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6232: --- Summary: Vector initializer used for memory allocation in external sort is subject to aliasing Key: DRILL-6232 URL: https://issues.apache.org/jira/browse/DRILL-6232

[jira] [Created] (DRILL-6231) Fix memory allocation for repeated list vector

2018-03-11 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6231: --- Summary: Fix memory allocation for repeated list vector Key: DRILL-6231 URL: https://issues.apache.org/jira/browse/DRILL-6231 Project: Apache Drill

[jira] [Created] (DRILL-6206) RowSetBuilder does not support repeated list

2018-03-02 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6206: --- Summary: RowSetBuilder does not support repeated list Key: DRILL-6206 URL: https://issues.apache.org/jira/browse/DRILL-6206 Project: Apache Drill

[jira] [Created] (DRILL-6205) Reduce memory consumption of testFlattenUpperLimit test

2018-03-02 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6205: --- Summary: Reduce memory consumption of testFlattenUpperLimit test Key: DRILL-6205 URL: https://issues.apache.org/jira/browse/DRILL-6205 Project: Apache Drill

Re: [DISCUSS] 1.13.0 release

2018-03-02 Thread Padma Penumarthy
I can look at the failure in testFlattenUpperLimit. I added this test recently. Thanks Padma On Mar 2, 2018, at 8:19 AM, Volodymyr Tkach wrote: testFlattenUpperLimit

[jira] [Created] (DRILL-6203) Repeated Map Vector does not give correct payload bytecount

2018-03-01 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6203: --- Summary: Repeated Map Vector does not give correct payload bytecount Key: DRILL-6203 URL: https://issues.apache.org/jira/browse/DRILL-6203 Project: Apache

Re: [ANNOUNCE] New Committer: Kunal Khatua

2018-02-27 Thread Padma Penumarthy
Congratulations Kunal ! Thanks Padma > On Feb 27, 2018, at 8:42 AM, Aman Sinha wrote: > > The Project Management Committee (PMC) for Apache Drill has invited Kunal > Khatua to become a committer, and we are pleased to announce that he > has accepted. > > Over the last

[jira] [Created] (DRILL-6184) Add batch sizing information to query profile

2018-02-22 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6184: --- Summary: Add batch sizing information to query profile Key: DRILL-6184 URL: https://issues.apache.org/jira/browse/DRILL-6184 Project: Apache Drill

[jira] [Created] (DRILL-6180) Use System Option "output_batch_size" for External Sort

2018-02-22 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6180: --- Summary: Use System Option "output_batch_size" for External Sort Key: DRILL-6180 URL: https://issues.apache.org/jira/browse/DRILL-6180 Project: Ap

[jira] [Created] (DRILL-6177) Merge Join - Allocate memory for outgoing value vectors based on sizes of incoming batches.

2018-02-21 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6177: --- Summary: Merge Join - Allocate memory for outgoing value vectors based on sizes of incoming batches. Key: DRILL-6177 URL: https://issues.apache.org/jira/browse/DRILL-6177

[jira] [Created] (DRILL-6166) RecordBatchSizer does not handle hyper vectors

2018-02-18 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6166: --- Summary: RecordBatchSizer does not handle hyper vectors Key: DRILL-6166 URL: https://issues.apache.org/jira/browse/DRILL-6166 Project: Apache Drill

[jira] [Created] (DRILL-6162) Enhance record batch sizer to retain nesting information for map columns.

2018-02-14 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6162: --- Summary: Enhance record batch sizer to retain nesting information for map columns. Key: DRILL-6162 URL: https://issues.apache.org/jira/browse/DRILL-6162

[jira] [Created] (DRILL-6161) Allocate memory for outgoing vectors based on sizing calculations

2018-02-14 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6161: --- Summary: Allocate memory for outgoing vectors based on sizing calculations Key: DRILL-6161 URL: https://issues.apache.org/jira/browse/DRILL-6161 Project

[jira] [Created] (DRILL-6160) Limit batch size for streaming aggregate based on memory

2018-02-14 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6160: --- Summary: Limit batch size for streaming aggregate based on memory Key: DRILL-6160 URL: https://issues.apache.org/jira/browse/DRILL-6160 Project: Apache Drill

Re: Review for PR #1112: Metadata revisions

2018-02-12 Thread Padma Penumarthy
I will review. But, you still need to get a committer to do the final review and approve it. Thanks Padma > On Feb 12, 2018, at 8:37 AM, Paul Rogers wrote: > > Hi All, > Anyone available to review PR #1112? It is another step in committing the > final result set

Re: Batch Sizing for Parquet Flat Reader

2018-02-12 Thread Padma Penumarthy
modify code generation. AFAIK, that is not something anyone is working on, so another advantage of the average batch size method is that it works with the code generation we already have. Thanks, - Paul On Sunday, February 11, 2018, 7:28:52 PM PST, Padma Penumarthy <ppenumar...@mapr.com>

Re: Batch Sizing for Parquet Flat Reader

2018-02-11 Thread Padma Penumarthy
With the average row size method, since I know the number of rows and the average size for each column, I am planning to use that information to allocate the required memory for each vector upfront. This should help avoid copying every time we double the vector and also improve memory utilization. Thanks Padma >
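The arithmetic behind this idea can be sketched as below. This is a rough illustration only, not Drill's RecordBatchSizer or vector allocator: every class, method, and parameter name here is hypothetical, and the batch budget and row cap are assumed inputs rather than documented Drill settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of upfront sizing; not Drill's actual API. */
class UpfrontAllocationSketch {

    /** Rows that fit in the batch budget given the average row width, clamped to [1, maxRows]. */
    static int rowsPerBatch(long batchBudgetBytes, int avgRowWidthBytes, int maxRows) {
        if (avgRowWidthBytes <= 0 || maxRows <= 0) {
            throw new IllegalArgumentException("widths and limits must be positive");
        }
        long rows = batchBudgetBytes / avgRowWidthBytes;
        return (int) Math.max(1L, Math.min(rows, (long) maxRows));
    }

    /** Bytes to reserve per column: row count times that column's average width. */
    static Map<String, Long> bytesPerColumn(int rowCount, Map<String, Integer> avgColumnWidths) {
        Map<String, Long> sizes = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : avgColumnWidths.entrySet()) {
            sizes.put(e.getKey(), (long) rowCount * e.getValue());
        }
        return sizes;
    }

    /** Reallocate-and-copy steps a doubling buffer needs to grow from initialBytes to targetBytes. */
    static int doublingsNeeded(long initialBytes, long targetBytes) {
        if (initialBytes <= 0) {
            throw new IllegalArgumentException("initialBytes must be positive");
        }
        int steps = 0;
        long capacity = initialBytes;
        while (capacity < targetBytes) {
            capacity *= 2; // each doubling implies copying the existing data
            steps++;
        }
        return steps;
    }
}
```

Sizing each vector from the row count and average column width means the buffer starts at its target capacity, so the copies counted by `doublingsNeeded` are avoided.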

[jira] [Created] (DRILL-6138) Move RecordBatchSizer to org.apache.drill.exec.record package

2018-02-05 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6138: --- Summary: Move RecordBatchSizer to org.apache.drill.exec.record package Key: DRILL-6138 URL: https://issues.apache.org/jira/browse/DRILL-6138 Project: Apache

[jira] [Created] (DRILL-6133) RecordBatchSizer throws IndexOutOfBounds Exception for union vector

2018-02-02 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6133: --- Summary: RecordBatchSizer throws IndexOutOfBounds Exception for union vector Key: DRILL-6133 URL: https://issues.apache.org/jira/browse/DRILL-6133 Project

[jira] [Created] (DRILL-6126) Allocate memory for value vectors upfront in flatten operator

2018-01-30 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6126: --- Summary: Allocate memory for value vectors upfront in flatten operator Key: DRILL-6126 URL: https://issues.apache.org/jira/browse/DRILL-6126 Project: Apache

[jira] [Created] (DRILL-6123) Limit batch size for Merge Join based on memory

2018-01-30 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6123: --- Summary: Limit batch size for Merge Join based on memory Key: DRILL-6123 URL: https://issues.apache.org/jira/browse/DRILL-6123 Project: Apache Drill

Re: [ANNOUNCE] New PMC member: Paul Rogers

2018-01-30 Thread Padma Penumarthy
Congratulations Paul. Thanks Padma > On Jan 30, 2018, at 1:55 PM, Gautam Parai wrote: > > Congratulations Paul! > > > From: Timothy Farkas > Sent: Tuesday, January 30, 2018 1:54:43 PM > To: dev@drill.apache.org > Subject:

[jira] [Created] (DRILL-6113) Limit batch size for Merge Receiver

2018-01-26 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6113: --- Summary: Limit batch size for Merge Receiver Key: DRILL-6113 URL: https://issues.apache.org/jira/browse/DRILL-6113 Project: Apache Drill Issue Type

[jira] [Created] (DRILL-6071) Limit batch size for flatten operator

2018-01-04 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6071: --- Summary: Limit batch size for flatten operator Key: DRILL-6071 URL: https://issues.apache.org/jira/browse/DRILL-6071 Project: Apache Drill Issue Type

[jira] [Created] (DRILL-5972) Slow performance for query on INFORMATION_SCHEMA.TABLE

2017-11-15 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5972: --- Summary: Slow performance for query on INFORMATION_SCHEMA.TABLE Key: DRILL-5972 URL: https://issues.apache.org/jira/browse/DRILL-5972 Project: Apache Drill

Re: Hangout Topics for 11/14/2017

2017-11-13 Thread Padma Penumarthy
Here are the topics so far: Unit Testing - Tim 1.12 Release - Arina Metadata Management - Padma Thanks Padma On Nov 13, 2017, at 1:15 PM, Padma Penumarthy <ppenumar...@mapr.com> wrote: Drill hangout tomorrow Nov 14th, at 10 AM PST. Please send email o

Hangout Topics for 11/14/2017

2017-11-13 Thread Padma Penumarthy
Drill hangout tomorrow Nov 14th, at 10 AM PST. Please send email or bring them up tomorrow, if you have topics to discuss. Hangout link: https://plus.google.com/hangouts/_/event/ci4rdiju8bv04a64efj5fedd0lc Thanks Padma

[jira] [Created] (DRILL-5899) No need to do isAscii check for simple pattern matcher

2017-10-23 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5899: --- Summary: No need to do isAscii check for simple pattern matcher Key: DRILL-5899 URL: https://issues.apache.org/jira/browse/DRILL-5899 Project: Apache Drill

Re: Excessive review comments

2017-10-19 Thread Padma Penumarthy
To subscribe to PRs and JIRAs of interest, we need to know about them first. Is it possible to get notified when a new PR or JIRA is created and not for further updates unless you subscribe to them ? Thanks Padma > On Oct 19, 2017, at 10:13 AM, Paul Rogers wrote: > > So,

Re: Parquet Metadata table on Rolling window

2017-10-16 Thread Padma Penumarthy
No, it does not. We rebuild metadata for all directories under mydata when you query /mydata. If new files get added anywhere in the whole hierarchy, metadata gets regenerated for all of them. However, if you query only /mydata/3 and nothing is changed under 3, no metadata is generated.

[jira] [Created] (DRILL-5854) IllegalStateException when empty batch with valid schema is received.

2017-10-06 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5854: --- Summary: IllegalStateException when empty batch with valid schema is received. Key: DRILL-5854 URL: https://issues.apache.org/jira/browse/DRILL-5854 Project

Re: Parquet Metadata table on Rolling window

2017-10-05 Thread Padma Penumarthy
Unfortunately, we do not do incremental metadata updates. If new files are getting added constantly, refresh table metadata will not help. Thanks Padma > On Oct 5, 2017, at 5:36 PM, François Méthot wrote: > > Hi, > > I have been using drill for more than year now, we

[jira] [Created] (DRILL-5839) Handle Empty Batches in Merge Receiver

2017-10-04 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5839: --- Summary: Handle Empty Batches in Merge Receiver Key: DRILL-5839 URL: https://issues.apache.org/jira/browse/DRILL-5839 Project: Apache Drill Issue Type

Re: IntelliJ code format

2017-09-07 Thread Padma Penumarthy
@weijie, link is updated now with the new jar. Please check it out and let us know if any other issues. Thanks, Padma On Aug 21, 2017, at 8:59 AM, weijie tong <tongweijie...@gmail.com> wrote: @padma ,what's the process? On Wed, 9 Aug 2017 at 1

Re: IntelliJ code format

2017-08-21 Thread Padma Penumarthy
Sorry for the delay. I am working with our doc team to get this link updated. Will let you know once done. Thanks, Padma > On Aug 21, 2017, at 8:59 AM, weijie tong <tongweijie...@gmail.com> wrote: > > @padma ,what's the process? > > On Wed, 9 Aug 2017 at 1:04 AM Padma

Re: Discuss about Drill's schedule policy

2017-08-20 Thread Padma Penumarthy
If control RPC to a drillbit is down, i.e., if a drillbit is not responding, ZooKeeper should detect that and notify other drillbits to remove the dead drillbit from their active list. Once that happens, the next query that comes in should not even see that drillbit. We need a way to differentiate

[jira] [Created] (DRILL-5731) regionsToScan is computed multiple times for MapR DB Json and Binary tables

2017-08-18 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5731: --- Summary: regionsToScan is computed multiple times for MapR DB Json and Binary tables Key: DRILL-5731 URL: https://issues.apache.org/jira/browse/DRILL-5731

Re: Failing Tests In Master

2017-08-17 Thread Padma Penumarthy
.work.metadata.TestMetadataProvider > Running > org.apache.drill.exec.work.metadata.TestMetadataProvider#columnsWithColumnNameFilter > Running org.apache.drill.exec.sql.TestViewSupport > ... > > Results : > > Tests run: 82, Failures: 0, Errors: 0, Skipped: 1 > > > On Thu, Aug 17, 2017 at 3:59 PM, Padma

Re: Failing Tests In Master

2017-08-17 Thread Padma Penumarthy
Yes, I do see these failures. Thanks, Padma > On Aug 17, 2017, at 3:55 PM, Timothy Farkas wrote: > > These tests are consistently failing for me with the latest master. Does > anyone else see this issue? > > > Failed tests: > > TestMetadataProvider.tables:153 expected:

filter performance with like

2017-08-10 Thread Padma Penumarthy
I was playing around trying to figure out how to improve performance for simple pattern matching predicates with like. Documented my findings here. https://issues.apache.org/jira/browse/DRILL-5697 Want to find out if folks have any thoughts or feedback on this. Plan is to go with simple

Re: IntelliJ code format

2017-08-08 Thread Padma Penumarthy
You are right. It is configured as 4. We should fix that. Thanks, Padma > On Aug 8, 2017, at 8:12 AM, weijie tong wrote: > > the download site url : > https://drill.apache.org/docs/apache-drill-contribution-guidelines/ > > On Tue, Aug 8, 2017 at 10:59 PM, weijie

[jira] [Created] (DRILL-5697) Improve performance of filter operator for pattern matching

2017-07-31 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5697: --- Summary: Improve performance of filter operator for pattern matching Key: DRILL-5697 URL: https://issues.apache.org/jira/browse/DRILL-5697 Project: Apache

Re: [VOTE] Release Apache Drill 1.11.0 - rc0

2017-07-26 Thread Padma Penumarthy
+1 (non-binding) Tried in embedded mode on my mac. Ran some queries. Downloaded and built on CentOS VM. Installed the build on the 4 node cluster. Ran some queries on parquet files. Thanks, Padma > On Jul 26, 2017, at 2:54 PM, Kunal Khatua wrote: > > +1 (non-binding) >

Re: [HANGOUT] Topics for 7/25/17

2017-07-24 Thread Padma Penumarthy
I have a topic to discuss. A lot of folks on the user mailing list raised the issue of not being able to access all S3 regions using Drill. We need Hadoop version 2.8 or higher to be able to connect to regions which support only Version 4 signature. I tried with 2.8.1, which just got released and it

[jira] [Created] (DRILL-5587) Validate Parquet blockSize and pageSize configured with SYSTEM/SESSION option

2017-06-13 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5587: --- Summary: Validate Parquet blockSize and pageSize configured with SYSTEM/SESSION option Key: DRILL-5587 URL: https://issues.apache.org/jira/browse/DRILL-5587

Re: [HANGOUT] Topics for 6/12/17

2017-06-13 Thread Padma Penumarthy
It has links to background information and design docs. Please ask questions and provide comments in the JIRA. For the next meeting, we can choose another topic like this and go in detail. Thanks, Padma On Jun 12, 2017, at 10:21 AM, Padma Penumarthy <ppenumar...@mapr.com<mailto:ppenumar

[HANGOUT] Topics for 6/12/17

2017-06-12 Thread Padma Penumarthy
Drill hangout will be tomorrow, 10 AM PST. In the last hangout, we talked about discussing one of the ongoing Drill projects in detail. Please let me know who wants to volunteer to discuss the topic they are working on - memory fragmentation, spill to disk for hash agg, external sort and

[jira] [Created] (DRILL-5560) Create configuration file for distribution specific configuration

2017-05-31 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5560: --- Summary: Create configuration file for distribution specific configuration Key: DRILL-5560 URL: https://issues.apache.org/jira/browse/DRILL-5560 Project

Re: Parquet, Arrow, and Drill Roadmap

2017-05-02 Thread Padma Penumarthy
One thing I want to add is that use_new_reader uses the reader from the parquet-mr library, whereas the default one is Drill’s native reader, which is supposed to be better performance-wise. But it does not support complex types, and we automatically switch to the reader from the parquet library when we have to

Re: [Drill 1.10.0] : Memory was leaked by query

2017-04-18 Thread Padma Penumarthy
Seems like you are running into DRILL-5435. Try turning off async parquet reader and see if that helps. alter session set `store.parquet.reader.pagereader.async`=false; Thanks, Padma On Apr 18, 2017, at 6:14 AM, Anup Tiwari

[jira] [Created] (DRILL-5429) Cache tableStats per query for MapR DB JSON Tables

2017-04-11 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5429: --- Summary: Cache tableStats per query for MapR DB JSON Tables Key: DRILL-5429 URL: https://issues.apache.org/jira/browse/DRILL-5429 Project: Apache Drill

Re: Single Hdfs block per parquet file

2017-03-22 Thread Padma Penumarthy
p/fs/FileSystem.html#create(org.apache.hadoop.fs.Path,%20boolean,%20int,%20short,%20long) > > Francois > > On Wed, Mar 22, 2017 at 4:29 PM, Padma Penumarthy <ppenumar...@mapr.com> > wrote: > >> I think we create one file for each parquet block. >>

Re: Single Hdfs block per parquet file

2017-03-22 Thread Padma Penumarthy
I think we create one file for each parquet block. If the underlying HDFS block size is 128 MB and the parquet block size is > 128 MB, it will create more blocks on HDFS. Can you let me know which HDFS API would allow you to do otherwise? Thanks, Padma > On Mar 22, 2017, at 11:54 AM,
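The block arithmetic in this reply can be illustrated with a small calculation. This is a sketch under stated assumptions, not an HDFS or Drill API: the class and method names are hypothetical, and it assumes the file is roughly one parquet block in size and that HDFS splits a file into fixed-size blocks with a shorter final block.

```java
/** Hypothetical helper; not part of the HDFS or Drill APIs. */
class HdfsBlockCountSketch {

    /** Number of HDFS blocks needed to hold a file of the given size (ceiling division). */
    static long hdfsBlocksFor(long fileBytes, long hdfsBlockBytes) {
        if (fileBytes < 0 || hdfsBlockBytes <= 0) {
            throw new IllegalArgumentException("sizes must be non-negative and block size positive");
        }
        return (fileBytes + hdfsBlockBytes - 1) / hdfsBlockBytes;
    }
}
```

For example, a 256 MB parquet block written to a file system with 128 MB HDFS blocks occupies two HDFS blocks, while a parquet block of 128 MB or less fits in one.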

Re: Time for 1.10 release

2017-03-02 Thread Padma Penumarthy
Hi Jinfeng, Please include DRILL-5287, DRILL-5290 and DRILL-5304. Thanks, Padma > On Feb 22, 2017, at 11:16 PM, Jinfeng Ni wrote: > > Hi Drillers, > > It has been almost 3 months since we release Drill 1.9. We have > resolved plenty of fixes and improvements (closed around

[jira] [Created] (DRILL-5290) Provide an option to build operator table once for built-in static functions and reuse it across queries.

2017-02-22 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5290: --- Summary: Provide an option to build operator table once for built-in static functions and reuse it across queries. Key: DRILL-5290 URL: https://issues.apache.org/jira

[jira] [Created] (DRILL-5287) Provide option to skip updates of ephemeral state changes in Zookeeper

2017-02-21 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-5287: --- Summary: Provide option to skip updates of ephemeral state changes in Zookeeper Key: DRILL-5287 URL: https://issues.apache.org/jira/browse/DRILL-5287 Project
