[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1057 Please let me know if I should make any more changes. ---
[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1057 @paul-rogers since HashJoin does not need to support selection vectors, maybe we can postpone adding the corresponding appendRow methods until they are needed. I suspect that by the time anyone needs those methods, we will have already migrated over to the new batch framework. ---
[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...
Github user ilooner commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r153958856

--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java ---
@@ -124,4 +132,52 @@ public void testContainerMerge() {
     leftIndirect.clear();
     right.clear();
   }
+
+  @Test
+  public void testAppendRow()
+  {
+    MaterializedField colA = MaterializedField.create("colA", Types.required(TypeProtos.MinorType.INT));
+    MaterializedField colB = MaterializedField.create("colB", Types.required(TypeProtos.MinorType.INT));
--- End diff --

Added more interesting types. Currently the RowSet classes don't support the Map data type. Paul asked me to look into adding support for this a while ago for DRILL-5870. I'll update the test framework to support that in the next PR. ---
[GitHub] drill issue #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1024 @laurentgo I've added server-triggered timeout tests and made other changes as well, but they require support for [DRILL-5973](https://issues.apache.org/jira/browse/DRILL-5973). I tested this commit (#1024) as a cherry-pick on top of that PR's commit (#1055) and was able to simulate the server-induced timeout. I will need a +1 for that PR before I can enable the tests here. For now, I've marked these tests as `@Ignore` to ensure that the remaining tests pass and the feature works as intended. Can you review them both (this and #1055)? ---
[GitHub] drill pull request #1058: DRILL-6002: Avoid memory copy from direct buffer t...
GitHub user vrozov opened a pull request: https://github.com/apache/drill/pull/1058 DRILL-6002: Avoid memory copy from direct buffer to heap while spilling to local disk @paul-rogers Please review You can merge this pull request into a Git repository by running: $ git pull https://github.com/vrozov/drill DRILL-6002 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1058.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1058 commit 8e9124de681d3a8cd70bf0bb243460cb78dcb295 Author: Vlad Rozov Date: 2017-11-22T22:06:13Z DRILL-6002: Avoid memory copy from direct buffer to heap while spilling to local disk ---
[jira] [Created] (DRILL-6002) Avoid memory copy from direct buffer to heap while spilling to local disk
Vlad Rozov created DRILL-6002:
Summary: Avoid memory copy from direct buffer to heap while spilling to local disk
Key: DRILL-6002
URL: https://issues.apache.org/jira/browse/DRILL-6002
Project: Apache Drill
Issue Type: Improvement
Reporter: Vlad Rozov
Assignee: Vlad Rozov

When spilling to a local disk, or to any file system that supports WritableByteChannel, it is preferable to avoid a copy from off-heap memory to the Java heap, as WritableByteChannel can work directly with off-heap memory.

-- This message was sent by Atlassian JIRA (v6.4.14#64029)
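The improvement described above can be sketched with plain java.nio. This is standard NIO only, not Drill's actual spill code; the class and method names (`DirectSpillSketch`, `spill`) are invented for illustration. A FileChannel is a WritableByteChannel, so it consumes a direct ByteBuffer as-is with no intermediate heap byte[]:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: spill a direct (off-heap) buffer straight to disk.
public class DirectSpillSketch {
    public static void spill(ByteBuffer direct, Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            // The channel reads the off-heap memory directly;
            // no copy into a heap array is required.
            ch.write(direct);
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(8);
        buf.putLong(42L);
        buf.flip(); // switch from writing into the buffer to reading from it
        Path tmp = Files.createTempFile("spill", ".bin");
        spill(buf, tmp);
        System.out.println(Files.size(tmp)); // prints 8
        Files.delete(tmp);
    }
}
```

By contrast, an OutputStream-based path typically forces `byte[]`-sized chunks onto the heap first, which is the copy this JIRA aims to avoid.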
[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r153934729

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ---
@@ -353,6 +353,23 @@ public int getRecordCount() {
   public boolean hasRecordCount() { return recordCount != -1; }
+
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this {@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+    for (int vectorIndex = 0; vectorIndex < wrappers.size(); vectorIndex++) {
+      ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+      ValueVector srcVector = srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+      destVector.copyEntry(recordCount, srcVector, srcIndex);
+    }
+
+    recordCount++;
--- End diff --

The immediate need for appendRow() is to distribute rows from a single incoming batch into multiple other batches (for the Hash Join internal partitioning), based on the hash value of the key columns at each row. This would not work well with the second suggestion (vectorizing - column 1, column 2, etc.) ---
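The partitioning use case described above can be sketched in plain Java. This is illustrative only, not Drill's API; the class name `HashPartitionSketch` and the use of `List<Integer>` in place of value vectors are invented for this sketch. The point is that each row's destination depends on its own hash, which makes the copy inherently row-at-a-time:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of hash-based internal partitioning: rows from one
// incoming batch scatter across several destination "batches" based on
// the hash of each row's key, so a column-at-a-time bulk copy does not apply.
public class HashPartitionSketch {
    public static List<List<Integer>> partition(int[] keys, int numPartitions) {
        List<List<Integer>> partitions = new ArrayList<>();
        for (int p = 0; p < numPartitions; p++) {
            partitions.add(new ArrayList<>());
        }
        for (int key : keys) {
            // floorMod keeps the partition index non-negative for any hash
            int p = Math.floorMod(Integer.hashCode(key), numPartitions);
            partitions.get(p).add(key); // "appendRow" into that partition
        }
        return partitions;
    }

    public static void main(String[] args) {
        System.out.println(partition(new int[]{1, 2, 3, 4, 5, 6}, 2));
        // prints [[2, 4, 6], [1, 3, 5]]
    }
}
```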
[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...
Github user Ben-Zvi commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r153935071

--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java ---
@@ -124,4 +132,52 @@ public void testContainerMerge() {
     leftIndirect.clear();
     right.clear();
   }
+
+  @Test
+  public void testAppendRow()
+  {
+    MaterializedField colA = MaterializedField.create("colA", Types.required(TypeProtos.MinorType.INT));
+    MaterializedField colB = MaterializedField.create("colB", Types.required(TypeProtos.MinorType.INT));
--- End diff --

Maybe add some "interesting" data types? Testing integers only may miss some issues. ---
[jira] [Created] (DRILL-6001) Deprecate using assertions (-ea) to enable direct memory allocation tracing.
Vlad Rozov created DRILL-6001:
Summary: Deprecate using assertions (-ea) to enable direct memory allocation tracing.
Key: DRILL-6001
URL: https://issues.apache.org/jira/browse/DRILL-6001
Project: Apache Drill
Issue Type: Improvement
Reporter: Vlad Rozov
Assignee: Vlad Rozov
Priority: Minor

Drill uses assertions (-ea) to enable memory allocation tracing. Most of the time, assertions are enabled or disabled globally (for all packages) with the "-ea" java command-line option, which leads to excessive CPU and heap utilization. It would be better to limit the impact of enabling assertions to the Java "assert" statement, as the majority of Java developers expect, and to use a separate property (which already exists) to enable or disable direct memory allocation tracing/debugging.
[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...
Github user paul-rogers commented on a diff in the pull request: https://github.com/apache/drill/pull/1057#discussion_r153926390

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java ---
@@ -353,6 +353,23 @@ public int getRecordCount() {
   public boolean hasRecordCount() { return recordCount != -1; }
+
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this {@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+    for (int vectorIndex = 0; vectorIndex < wrappers.size(); vectorIndex++) {
+      ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+      ValueVector srcVector = srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+      destVector.copyEntry(recordCount, srcVector, srcIndex);
+    }
+
+    recordCount++;
--- End diff --

This is OK for a row-by-row copy. But you'll get better performance if you optimize for the entire batch. Because you have no SV4, the source and dest batches are the same, so the vectors can be preloaded into an array of vectors to avoid the vector-wrapper lookup per column.

Plus, if the code is written per batch, you can go a step further: vectorize the operation. Copy all values for column 1, then all for column 2, and so on. (In this case, you only get each vector once, so sticking with the wrappers is fine.) By vectorizing, you may get the vectorized cache-locality benefit that Drill promises from its operations. Worth a try to see if you get any speed-up. ---
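The two loop orders contrasted above can be sketched with plain 2-D arrays standing in for value vectors (illustrative only, not Drill's API; `CopyOrder` and both method names are invented). The column-major version touches one column's memory at a time, which is the cache-locality argument:

```java
import java.util.Arrays;

// Hypothetical sketch: src and dest are indexed [column][row].
public class CopyOrder {
    // Row-major: for each row, touch every column (as appendRow does per call).
    public static void copyByRow(int[][] src, int[][] dest, int rows) {
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < src.length; c++) {
                dest[c][r] = src[c][r];
            }
        }
    }

    // Column-major ("vectorized"): copy all of column 0, then column 1, etc.
    // Each pass stays within one contiguous column, improving cache locality.
    public static void copyByColumn(int[][] src, int[][] dest, int rows) {
        for (int c = 0; c < src.length; c++) {
            System.arraycopy(src[c], 0, dest[c], 0, rows);
        }
    }

    public static void main(String[] args) {
        int[][] src = {{1, 2, 3}, {4, 5, 6}}; // two columns, three rows
        int[][] byRow = {new int[3], new int[3]};
        int[][] byCol = {new int[3], new int[3]};
        copyByRow(src, byRow, 3);
        copyByColumn(src, byCol, 3);
        System.out.println(Arrays.deepEquals(byRow, byCol)); // prints true
    }
}
```

Both orders produce identical output; only the memory-access pattern differs, which is why the suggestion is framed as a performance experiment rather than a correctness change.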
[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)
Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1057

To answer the two questions:

1. The copier is used in multiple locations, some of which include selection vectors. Sort uses a copier to merge rows coming from multiple sorted batches. The SVR (selection vector remover) compresses out SVs. A filter will produce an SV2, which the SVR removes. An in-memory sort produces an SV4. But, because of the way plans are generated, the hash join will never see a batch with an SV. (An SVR will be inserted, if needed, to remove the SV.)

2. We never write a batch using an SV. The SV is always a source indirection. Because we do the indirection on the source side (and vectors are append-only), there can be no SV on the destination side.

Note also that the {{VectorContainer}} class, despite its API, knows nothing about SVs. The SV is tacked on separately by the {{RecordBatch}}. (This is a less-than-ideal design, but it is how things work at present.) FWIW, the test-oriented {{RowSet}} abstractions came about as wrappers around both the {{VectorContainer}} and the SV to provide a unified view.

Because of how we do SVs, you'll need three copy methods: one for no SV, one for an SV2, and another for an SV4. In the fullness of time, the new "column reader" and "column writer" abstractions will hide all this stuff, but it will take time before those tools come online. ---
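The source-side indirection described above can be sketched with a plain int[] standing in for an SV2 (names are invented for illustration, not Drill's API). The selection vector maps logical row positions to physical indices in the underlying batch; because copies read through it on the source side, the densely appended destination never needs its own SV:

```java
import java.util.Arrays;

// Hypothetical sketch of SV2-style indirection over a single int column.
public class Sv2Sketch {
    public static int[] copyThroughSv2(int[] srcColumn, int[] sv2) {
        int[] dest = new int[sv2.length];
        for (int i = 0; i < sv2.length; i++) {
            // sv2[i] is the physical index of the i-th selected row;
            // the destination is written densely, in order.
            dest[i] = srcColumn[sv2[i]];
        }
        return dest;
    }

    public static void main(String[] args) {
        int[] column = {10, 20, 30, 40, 50};
        int[] sv2 = {0, 2, 4}; // a filter kept physical rows 0, 2, 4
        System.out.println(Arrays.toString(copyThroughSv2(column, sv2)));
        // prints [10, 30, 50]
    }
}
```

The no-SV case is the same loop with `sv2[i]` replaced by `i`, and an SV4-style copy would additionally resolve a batch index per row, which is why three copy methods are needed.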
[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1057 @Ben-Zvi @paul-rogers ---
[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...
GitHub user ilooner opened a pull request: https://github.com/apache/drill/pull/1057

DRILL-5993 Append Row Method For VectorContainer (WIP)

## Motivation

HashJoin requires a method that can take a row from a VectorContainer and append it to a destination VectorContainer. This is a WIP, and this PR is mainly opened to improve my understanding.

## Implementation

This is an initial implementation that works with simple VectorContainers that are not hyper batches and do not have selection vectors. It is also assumed that the user called **SchemaUtil.coerceContainer** on the source VectorContainer before using the newly added **appendRow** method.

## Questions

- Do we have to worry about selection vectors in the source container?
- Do we have to think about supporting hyper batches in the destination container?

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ilooner/drill DRILL-5993

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1057.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1057

commit ee43d6808562a1ff60c17fa7622b8358b63c7276
Author: Timothy Farkas
Date: 2017-11-29T20:38:41Z

- Initial Implementation of append row for a vector container ---
[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...
Github user parthchandra commented on a diff in the pull request: https://github.com/apache/drill/pull/1049#discussion_r153909807 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java --- @@ -138,6 +138,8 @@ return new ParquetFixedWidthDictionaryReaders.DictionaryBigIntReader(recordReader, allocateSize, descriptor, columnChunkMetaData, fixedLength, (BigIntVector) v, schemaElement); --- End diff -- OK I'll try to add these. BTW, I realized that the test files that I added for the unit tests are not annotated, so I'll need to fix those as well! ---
[GitHub] drill issue #1056: DRILL-6000: Categorized graceful shutdown unit tests as S...
Github user ilooner commented on the issue: https://github.com/apache/drill/pull/1056 @arina-ielchiieva ---
[GitHub] drill pull request #1056: DRILL-6000: Categorized graceful shutdown unit tes...
GitHub user ilooner opened a pull request: https://github.com/apache/drill/pull/1056 DRILL-6000: Categorized graceful shutdown unit tests as SlowTests Graceful shutdown unit tests were failing on Travis, and should not be run as part of the SmokeTests You can merge this pull request into a Git repository by running: $ git pull https://github.com/ilooner/drill DRILL-6000 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1056.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1056 commit 5ecaade4b5cdc91a3153a4e2394cfdd993eeb5cf Author: Timothy Farkas Date: 2017-11-29T18:52:29Z DRILL-6000: Categorized graceful shutdown unit tests as SlowTests ---
[GitHub] drill issue #1055: DRILL-5973 : Support injection of time-bound pauses in se...
Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1055 @laurentgo / @parthchandra Please review this. It is the basis for unit tests in DRILL-3640 ---
[GitHub] drill pull request #1055: DRILL-5973 : Support injection of time-bound pause...
GitHub user kkhatua opened a pull request: https://github.com/apache/drill/pull/1055

DRILL-5973: Support injection of time-bound pauses in server

Support pause injections in the test framework that are time-bound, to allow for testing high-latency scenarios; e.g. a delayed server response to the Drill client allows for testing a server-induced timeout. This would allow for testing of DRILL-3640.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-5973

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/1055.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1055

commit 62e6f721183d648797d5329e94b277cd5722bba6
Author: Kunal Khatua
Date: 2017-11-29T19:00:11Z

DRILL-5973: Support injection of time-bound pauses in server

Support pause injections in the test framework that are time-bound, to allow for testing high-latency scenarios; e.g. a delayed server response to the Drill client allows for testing a server-induced timeout. ---
[jira] [Created] (DRILL-6000) Graceful Shutdown Unit Tests Should Not Be Run On Travis
Timothy Farkas created DRILL-6000:
Summary: Graceful Shutdown Unit Tests Should Not Be Run On Travis
Key: DRILL-6000
URL: https://issues.apache.org/jira/browse/DRILL-6000
Project: Apache Drill
Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas
Re: [EXT] Re: Food for thought about intra-document operation
Damien, for the intra-document operations, it would be useful to add support for LATERAL joins (SQL standard), which in conjunction with UNNEST (or FLATTEN) should address the use case you have. I have filed a JIRA for this: https://issues.apache.org/jira/browse/DRILL-5999.

-Aman

On Tue, Sep 26, 2017 at 12:04 AM, Damien Profeta wrote:

> Hello Aman,
>
> AsterixDB seems to follow standard SQL with a few minor modifications, and adds functions to ease aggregations (array_count, array_avg, ...).
>
> That would tend to confirm at least that support for UNNEST is a good idea to improve Drill.
>
> Best regards
> Damien
>
> On 09/25/2017 07:53 PM, Aman Sinha wrote:
>
>> Damien,
>> thanks for initiating the discussion.. indeed this would be a very useful enhancement. Currently, Drill provides repeated_contains() for filtering and repeated_count() for count aggregates on arrays, but not the general-purpose intra-document operations that you need based on your example. I haven't gone through all the alternatives, but in addition to what you have described, you might also want to look at SQL++ (https://ci.apache.org/projects/asterixdb/sqlpp/manual.html), which has been adopted by AsterixDB and has syntax extensions to SQL for unstructured data.
>>
>> -Aman
>>
>> On Mon, Sep 25, 2017 at 6:10 AM, Damien Profeta <damien.prof...@amadeus.com> wrote:
>>
>>> Hello,
>>>
>>> A few formats handled by Drill enable working with documents, meaning nested and repeated structures instead of just tables. JSON and Parquet are the two that come to my mind right now. Document modeling is a great way to express complex objects and is used a lot in my company. Drill is able to handle them, but unfortunately it cannot do much computation on them. By computation I mean filtering branches of the document, computing statistics (avg, min, max) on parts of the document, and so on. That would be very useful as an analytic tool.
>>>
>>> _What can be done_
>>>
>>> The question then is how to express the computation we want to do on the document. I have found multiple ways to handle that, and I don't really know which one is the best, hence this mail to expose what I have found and perhaps initiate discussion.
>>>
>>> First, if we look back at the Dremel paper, which is the basis of the Parquet format and also one of the inspirations for Drill, Dremel adds the special keyword "WITHIN" to SQL to specify that the computation has to be done within a document. What is very powerful with this keyword is that it allows you to generate documents and doesn't force you to flatten everything. You can find examples of its usage in the Google successor of Dremel, BigQuery, and its documentation: https://cloud.google.com/bigquery/docs/legacy-nested-repeated.
>>>
>>> But it seems that this was problematic for Google, because they now propose a SQL dialect for BigQuery that appears to be compliant with SQL 2011 to handle such computation. I am not familiar with SQL 2011, but the BigQuery documentation says it integrates the keywords for nested and repeated structures. You can see how this is done in BigQuery here: https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays. Basically, what I have seen is that they leverage the UNNEST and ARRAY keywords and are then able to use JOIN or CROSS JOIN to describe the aggregation.
>>>
>>> In Impala, they have added a way to apply a subquery to a complex type such that the subquery acts only intra-document. I have no idea if this is standard SQL or not. In the page https://www.cloudera.com/documentation/enterprise/5-5-x/topics/impala_complex_types.html#complex_types, look at the phrase "The subquery labelled SUBQ1 is correlated:" for an example.
>>>
>>> In Presto, you can apply lambda functions to maps/arrays to transform the structure and apply filters on it. So you have filter and map_filter functions to filter arrays and maps respectively (cf. https://prestodb.io/docs/current/functions/lambda.html#filter).
>>>
>>> _Example_
>>>
>>> If I want to make a short example, let's say we have a flight with a group of passengers in it. A document would be:
>>>
>>> {"flightnb":1234, "group":[{"age":30,"gender":"M"},{"age":15,"gender":"F"},{"age":10,"gender":"F"},{"age":30,"gender":"F"}]}
>>>
>>> The database would be millions of such documents, and I want to know the average age of the male passengers for every flight.
>>>
>>> In Dremel, the query would be something like: select flightnb, avg(male_age) within record from (select groups.age as male_age from flight where group.gender = "M")
>>>
>>> With SQL, it would be something like: select flightnb, avg(male_age) from (array(select g.age as male_age from unnest(group) as g where g.gender = "M") as male_age)
>>>
>>> With Impala it would be something
[jira] [Created] (DRILL-5999) Add support for LATERAL join
Aman Sinha created DRILL-5999:
Summary: Add support for LATERAL join
Key: DRILL-5999
URL: https://issues.apache.org/jira/browse/DRILL-5999
Project: Apache Drill
Issue Type: New Feature
Components: Query Planning & Optimization
Affects Versions: 1.11.0
Reporter: Aman Sinha

The LATERAL keyword in the SQL standard can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.) Calcite supports the LATERAL syntax. In Drill, we should add support for it in the planning and execution phases. The main motivation for supporting it is that it makes handling complex types such as arrays and maps more expressive and performant. For instance, suppose you have a customers table which contains one row per customer, holding a customer-id, a name, and an array of orders for that customer. Suppose you want to find, for each customer, the average order amount. This could be expressed as follows using the SQL standard LATERAL and UNNEST syntax:

{noformat}
SELECT customer_name FROM customers c, LATERAL (SELECT AVG(order_amount) FROM UNNEST(c.orders));
{noformat}

The subquery may contain other operations, such as filtering, which operate on the output of the un-nested c.orders array. The UNNEST operation is supported in Drill today by the FLATTEN operator. More details on the use cases for LATERAL are available from existing product documentation, e.g. see [1].

[1] https://www.postgresql.org/docs/9.4/static/queries-table-expressions.html
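The per-row semantics of the LATERAL + UNNEST example above can be sketched in plain Java (a hypothetical in-memory data model, invented names; this is not Drill code). For each outer row, the "lateral" subquery sees only that row's unnested orders array and aggregates over it:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: customers maps customer name -> that row's
// nested orders array; the inner loop plays the role of the correlated
// LATERAL subquery with UNNEST.
public class LateralSketch {
    public static Map<String, Double> avgOrderPerCustomer(Map<String, List<Double>> customers) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<Double>> c : customers.entrySet()) {
            double sum = 0.0;
            for (double amount : c.getValue()) { // UNNEST(c.orders)
                sum += amount;
            }
            result.put(c.getKey(),
                    c.getValue().isEmpty() ? 0.0 : sum / c.getValue().size());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<Double>> customers = new LinkedHashMap<>();
        customers.put("alice", Arrays.asList(10.0, 30.0));
        customers.put("bob", Arrays.asList(5.0));
        System.out.println(avgOrderPerCustomer(customers));
        // prints {alice=20.0, bob=5.0}
    }
}
```

Without LATERAL, the sub-SELECT could not reference `c` at all, which is exactly the cross-reference restriction the JIRA describes.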
[VOTE] Release Apache Drill 1.12.0 - rc0
Hi all, I'd like to propose the first release candidate (rc0) of Apache Drill, version 1.12.0. The release candidate covers a total of 167 resolved JIRAs [1]. Thanks to everyone who contributed to this release. The tarball artifacts are hosted at [2] and the maven artifacts are hosted at [3]. This release candidate is based on commit 54d3d201882ef5bc2e0f754fd10edfead9947b60 located at [4]. The vote ends at 3:00 PM UTC (7:00 AM PT), December 1, 2017. [ ] +1 [ ] +0 [ ] -1 Here's my vote: +1 [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12341087 [2] http://home.apache.org/~arina/drill/releases/1.12.0/rc0/ [3] https://repository.apache.org/content/repositories/orgapachedrill-1043/ [4] https://github.com/arina-ielchiieva/drill/commits/drill-1.12.0 Kind regards Arina
Re: [DISCUSS] Drill 1.12.0 release
All pending Jiras are merged. Starting release candidate preparation.

*Note for the committers:* until the release is over and the Drill version is changed to 1.13.0-SNAPSHOT, please do not push any changes into Drill master.

Kind regards
Arina

On Mon, Nov 27, 2017 at 6:52 PM, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:

> Current status:
>
> DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh, code reviewer - Paul) - fix is expected by the EOD in a new pull request.
> DRILL-4286: Graceful shutdown of drillbit (developer - Jyothsna, code reviewer - Paul) - unit test failures are fixed. Unit test performance degraded 3x!
>
> Kind regards
>
> On Sun, Nov 26, 2017 at 6:15 PM, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
>
>> Current status:
>>
>> DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh, code reviewer - Paul) - could not cherry-pick the commits. Needs fix.
>> DRILL-4286: Graceful shutdown of drillbit (developer - Jyothsna, code reviewer - Paul) - there are unit test failures. Needs fix.
>>
>> Kind regards
>>
>> On Sat, Nov 25, 2017 at 11:53 PM, AnilKumar B wrote:
>>
>>> Hi Arina,
>>>
>>> Sorry for the delay. Just now we squashed the Kafka storage plugin commits into one commit and pushed.
>>>
>>> Thanks & Regards,
>>> B Anil Kumar.
>>>
>>> On Sat, Nov 25, 2017 at 5:56 AM, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
>>>
>>>> Current status:
>>>>
>>>> DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh, code reviewer - Paul) - needs to squash the commits.
>>>> DRILL-4286: Graceful shutdown of drillbit (developer - Jyothsna, code reviewer - Paul) - needs to address some code review comments.
>>>>
>>>> Kind regards
>>>> Arina
>>>>
>>>> On Wed, Nov 15, 2017 at 2:38 PM, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
>>>>
>>>>> Current status: we are close to the code freeze, which will happen no later than the end of next week.
>>>>>
>>>>> Blocker:
>>>>> DRILL-5917: Ban org.json:json library in Drill (developer - Vlad R., code reviewer - Arina) - in progress.
>>>>>
>>>>> Targeted for 1.12 release:
>>>>> DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh, code reviewer - Paul) - needs next round of code review.
>>>>> DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN mechanism (developer - Sorabh, code reviewer - Parth & Laurent) - waiting for Parth's code review.
>>>>> DRILL-5771: Fix serDe errors for format plugins (developer - Arina, code reviewer - Tim) - code review is done, waiting for the merge.
>>>>>
>>>>> Kind regards
>>>>>
>>>>> On Fri, Nov 10, 2017 at 9:32 AM, Chunhui Shi wrote:
>>>>>
>>>>>> Hi Arina,
>>>>>>
>>>>>> Could we consider including DRILL-5089 in 1.12.0? It is about lazy loading of schema for storage plugins. Could you or Paul take a look at the pull request for this JIRA: https://github.com/apache/drill/pull/1032? I think both of you are familiar with this part.
>>>>>>
>>>>>> Thanks,
>>>>>> Chunhui
>>>>>>
>>>>>> From: Arina Yelchiyeva
>>>>>> Sent: Thursday, November 9, 2017 8:11:35 AM
>>>>>> To: dev@drill.apache.org
>>>>>> Subject: Re: [DISCUSS] Drill 1.12.0 release
>>>>>>
>>>>>> Yes, they are already in master.
>>>>>>
>>>>>> On Thu, Nov 9, 2017 at 6:05 PM, Charles Givre wrote:
>>>>>>
>>>>>>> We're including the networking functions in this release, right?
>>>>>>>
>>>>>>>> On Nov 9, 2017, at 11:04, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> If changes are done before the cut-off date, targeting mid-November, it will be possible to include this Jira.
>>>>>>>>
>>>>>>>> On Thu, Nov 9, 2017 at 6:03 PM, Charles Givre wrote:
>>>>>>>>
>>>>>>>>> Hi Arina,
>>>>>>>>> Can we include DRILL-4091 (Support for additional GIS operations) in version 1.12? In general the code looked pretty good. There was a unit test missing, which the developer submitted, and some minor formatting issues which I'm still waiting on.
>>>>>>>>> Thanks,
>>>>>>>>> -C
>>>>>>>>>
>>>>>>>>>> On Nov 9, 2017, at 10:58, Arina Yelchiyeva <arina.yelchiy...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Current status:
>>>>>>>>>>
>>>>>>>>>> Blocker:
>>>>>>>>>> DRILL-5917: Ban org.json:json library in Drill (developer - Vlad R., code reviewer - ?) - in progress.
>>>>>>>>>>
>>>>>>>>>> Targeted for 1.12 release:
>>>>>>>>>> DRILL-5337: OpenTSDB plugin (develo
[GitHub] drill pull request #1050: DRILL-5964: Do not allow queries to access paths o...
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1050 ---
[GitHub] drill pull request #1053: DRILL-5989 Travis Finally Runs Smoke Tests!!!
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1053 ---
[GitHub] drill pull request #1038: DRILL-5972: Slow performance for query on INFORMAT...
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1038 ---
[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit
Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/921 ---
[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/1049#discussion_r153739725

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java ---
@@ -138,6 +138,8 @@
     return new ParquetFixedWidthDictionaryReaders.DictionaryBigIntReader(recordReader, allocateSize, descriptor, columnChunkMetaData, fixedLength, (BigIntVector) v, schemaElement);
--- End diff --

The `DATE` logical type is also encoded as the `INT32` physical type [1], so could you please add support for it as well?

[1] https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date ---
[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...
Github user vvysotskyi commented on a diff in the pull request: https://github.com/apache/drill/pull/1049#discussion_r153735994

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java ---
@@ -138,6 +138,8 @@
     return new ParquetFixedWidthDictionaryReaders.DictionaryBigIntReader(recordReader, allocateSize, descriptor, columnChunkMetaData, fixedLength, (BigIntVector) v, schemaElement);
--- End diff --

This comment should have been placed at line [117](https://github.com/apache/drill/pull/1049/files?diff=unified#diff-4a7ec07122bfb16e4ff696af256f56dcR117), but I could not add it there. Should we also handle the case when `columnChunkMetaData.getType()` is `INT32` and `convertedType` is `INT_32`? ---