[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)

2017-11-29 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
Please let me know if I should make any more changes.


---


[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)

2017-11-29 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
@paul-rogers since HashJoin does not need to support Selection Vectors, 
maybe we can postpone adding the corresponding appendRow methods until they 
are needed. I suspect that by the time anyone needs those methods we will have 
already migrated to the new batch framework.


---


[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...

2017-11-29 Thread ilooner
Github user ilooner commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153958856
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java
 ---
@@ -124,4 +132,52 @@ public void testContainerMerge() {
 leftIndirect.clear();
 right.clear();
   }
+
+  @Test
+  public void testAppendRow()
+  {
+MaterializedField colA = MaterializedField.create("colA", 
Types.required(TypeProtos.MinorType.INT));
+MaterializedField colB = MaterializedField.create("colB", 
Types.required(TypeProtos.MinorType.INT));
--- End diff --

Added more interesting types. Currently the RowSet classes don't support the 
Map data type. Paul asked me to look into adding support for this a while ago 
for DRILL-5870. I'll update the test framework to support that in the next PR.


---


[GitHub] drill issue #1024: DRILL-3640: Support JDBC Statement.setQueryTimeout(int)

2017-11-29 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1024
  
@laurentgo 
I've added server-triggered timeout tests and made other changes as well, 
but they require support for 
[DRILL-5973](https://issues.apache.org/jira/browse/DRILL-5973). I tested this 
commit (#1024 ) as a cherry-pick on top of that PR's commit (#1055) and was 
able to simulate the server-induced timeout.
Will need a +1 for that PR before I can enable the tests here.
For now, I've marked these tests as `@Ignore` to ensure that the remaining 
tests pass and the feature works as intended. 

Can you review them both (this and #1055 )?


---


[GitHub] drill pull request #1058: DRILL-6002: Avoid memory copy from direct buffer t...

2017-11-29 Thread vrozov
GitHub user vrozov opened a pull request:

https://github.com/apache/drill/pull/1058

DRILL-6002: Avoid memory copy from direct buffer to heap while spilling to 
local disk

@paul-rogers Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/vrozov/drill DRILL-6002

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1058.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1058


commit 8e9124de681d3a8cd70bf0bb243460cb78dcb295
Author: Vlad Rozov 
Date:   2017-11-22T22:06:13Z

DRILL-6002: Avoid memory copy from direct buffer to heap while spilling to 
local disk




---


[jira] [Created] (DRILL-6002) Avoid memory copy from direct buffer to heap while spilling to local disk

2017-11-29 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-6002:
-

 Summary: Avoid memory copy from direct buffer to heap while 
spilling to local disk
 Key: DRILL-6002
 URL: https://issues.apache.org/jira/browse/DRILL-6002
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Vlad Rozov
Assignee: Vlad Rozov


When spilling to a local disk or to any file system that supports 
WritableByteChannel, it is preferable to avoid copying from off-heap memory to 
the Java heap, as WritableByteChannel can work directly with the off-heap 
memory.  
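A minimal stdlib sketch of the idea (plain java.nio, not Drill's actual spill code): a direct ByteBuffer can be handed straight to a FileChannel, which implements WritableByteChannel, so no intermediate heap byte[] is needed:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectSpill {
    // Writes a direct (off-heap) buffer straight to the channel. Unlike
    // OutputStream.write(byte[]), no intermediate byte[] on the Java heap
    // is required: the channel can read from off-heap memory directly.
    public static int spill(Path file, ByteBuffer direct) throws IOException {
        try (FileChannel ch = FileChannel.open(file,
                StandardOpenOption.CREATE, StandardOpenOption.WRITE)) {
            int written = 0;
            while (direct.hasRemaining()) {
                // FileChannel implements WritableByteChannel
                written += ch.write(direct);
            }
            return written;
        }
    }

    public static void main(String[] args) throws IOException {
        ByteBuffer buf = ByteBuffer.allocateDirect(16);
        buf.put("spill-test-bytes".getBytes(StandardCharsets.US_ASCII));
        buf.flip();
        Path tmp = Files.createTempFile("spill", ".bin");
        System.out.println("wrote " + spill(tmp, buf) + " bytes");
        Files.delete(tmp);
    }
}
```

The same pattern applies to any sink exposing WritableByteChannel; the copy to a heap array only becomes necessary when the target API accepts byte[] exclusively.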



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...

2017-11-29 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153934729
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java 
---
@@ -353,6 +353,23 @@ public int getRecordCount() {
 
   public boolean hasRecordCount() { return recordCount != -1; }
 
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no 
selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this 
{@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link 
VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+for (int vectorIndex = 0; vectorIndex < wrappers.size(); 
vectorIndex++) {
+  ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+  ValueVector srcVector = 
srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+  destVector.copyEntry(recordCount, srcVector, srcIndex);
+}
+
+recordCount++;
--- End diff --

 The immediate need for appendRow() is to distribute rows from a single 
incoming batch into multiple other batches (for the Hash Join internal 
partitioning), based on the hash value of the key columns at each row. This 
would not work well with the second suggestion (vectorizing - column 1, column 
2, etc.)  
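The routing described above can be illustrated with plain Java collections standing in for Drill's containers (the hash-and-modulo dispatch per row is the point here, not the API; real code would call VectorContainer.appendRow rather than List.add):

```java
import java.util.ArrayList;
import java.util.List;

public class HashPartitioner {
    // Distribute each incoming row to the partition chosen by the hash of
    // its key column. Rows are int[] here purely for illustration.
    public static List<List<int[]>> partition(int[][] rows, int keyCol,
                                              int numPartitions) {
        List<List<int[]>> partitions = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            partitions.add(new ArrayList<>());
        }
        for (int[] row : rows) {
            // floorMod keeps the partition index non-negative
            int p = Math.floorMod(Integer.hashCode(row[keyCol]), numPartitions);
            partitions.get(p).add(row); // append the whole row to its target batch
        }
        return partitions;
    }
}
```

Because the target partition changes row by row, a whole-row append is the natural unit of work, which is why a column-at-a-time copy fits poorly here.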


---


[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...

2017-11-29 Thread Ben-Zvi
Github user Ben-Zvi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153935071
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/record/TestVectorContainer.java
 ---
@@ -124,4 +132,52 @@ public void testContainerMerge() {
 leftIndirect.clear();
 right.clear();
   }
+
+  @Test
+  public void testAppendRow()
+  {
+MaterializedField colA = MaterializedField.create("colA", 
Types.required(TypeProtos.MinorType.INT));
+MaterializedField colB = MaterializedField.create("colB", 
Types.required(TypeProtos.MinorType.INT));
--- End diff --

 Maybe add some "interesting" datatypes? Testing integers only may miss 
some issues. 



---


[jira] [Created] (DRILL-6001) Deprecate using assertions (-ea) to enable direct memory allocation tracing.

2017-11-29 Thread Vlad Rozov (JIRA)
Vlad Rozov created DRILL-6001:
-

 Summary: Deprecate using assertions (-ea) to enable direct memory 
allocation tracing.
 Key: DRILL-6001
 URL: https://issues.apache.org/jira/browse/DRILL-6001
 Project: Apache Drill
  Issue Type: Improvement
Reporter: Vlad Rozov
Assignee: Vlad Rozov
Priority: Minor


Drill uses assertions (-ea) to enable memory allocation tracing. Most of the 
time assertions are enabled/disabled globally (for all packages) by using the "-ea" 
java command line option, which leads to excessive CPU and heap utilization. It 
would be better to limit the impact of enabling assertions to the java "assert" 
statement, as expected by the majority of Java developers, and use a separate 
property (which already exists) to enable/disable direct memory allocation 
tracing/debugging.  
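A sketch of the distinction (the property name below is hypothetical, for illustration only; Drill's actual flag lives in its allocator code):

```java
public class AllocationTrace {
    // Hypothetical property name for illustration; not Drill's real flag.
    // Read once at class load, so the cost is a single boolean check.
    private static final boolean TRACE_ENABLED =
        Boolean.getBoolean("drill.memory.allocator.debug");

    // Piggybacking on assertions: this is true whenever the JVM runs with
    // -ea, even for users who only wanted ordinary assert checks, which is
    // the coupling the JIRA proposes to remove.
    static boolean tracingViaAssertions() {
        boolean ea = false;
        assert ea = true; // the assignment executes only under -ea
        return ea;
    }

    // Decoupled: tracing is paid for only when explicitly requested via the
    // dedicated property, independently of the global -ea switch.
    static boolean tracingViaProperty() {
        return TRACE_ENABLED;
    }
}
```

With the property-based switch, running the whole suite under -ea no longer drags in allocation-tracing overhead across every package.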





[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...

2017-11-29 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/1057#discussion_r153926390
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/record/VectorContainer.java 
---
@@ -353,6 +353,23 @@ public int getRecordCount() {
 
   public boolean hasRecordCount() { return recordCount != -1; }
 
+  /**
+   * This works with non-hyper {@link VectorContainer}s which have no 
selection vectors.
+   * Appends a row taken from a source {@link VectorContainer} to this 
{@link VectorContainer}.
+   * @param srcContainer The {@link VectorContainer} to copy a row from.
+   * @param srcIndex The index of the row to copy from the source {@link 
VectorContainer}.
+   */
+  public void appendRow(VectorContainer srcContainer, int srcIndex) {
+for (int vectorIndex = 0; vectorIndex < wrappers.size(); 
vectorIndex++) {
+  ValueVector destVector = wrappers.get(vectorIndex).getValueVector();
+  ValueVector srcVector = 
srcContainer.wrappers.get(vectorIndex).getValueVector();
+
+  destVector.copyEntry(recordCount, srcVector, srcIndex);
+}
+
+recordCount++;
--- End diff --

This is OK for a row-by-row copy. But, you'll get better performance if you 
optimize for the entire batch. Because you have no SV4, the source and dest 
batches are the same so the vectors can be preloaded into an array of vectors 
to avoid the vector wrapper lookup per column.

Plus, if the code is written per batch, you can go a step further: 
vectorize the operation. Copy all values for column 1, then all for column 2, 
and so on. (In this case, you only get each vector once, so sticking with the 
wrappers is fine.) By vectorizing, you may get the vectorized cache-locality 
benefit that Drill promises from its operations. Worth a try to see if you get 
any speed-up.
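The two access patterns can be sketched with plain arrays standing in for value vectors (illustrative only, not Drill's vector API):

```java
public class CopyPatterns {
    // Row-by-row: for each row, touch every column once -- the pattern of
    // the appendRow() in the diff above. Each destination write jumps to a
    // different buffer.
    static void copyRowWise(int[][] srcCols, int[][] destCols, int rowCount) {
        for (int row = 0; row < rowCount; row++) {
            for (int col = 0; col < srcCols.length; col++) {
                destCols[col][row] = srcCols[col][row];
            }
        }
    }

    // Column-wise ("vectorized"): copy all values of column 1, then all of
    // column 2, and so on. Each copy stays inside one contiguous buffer,
    // which is where the cache-locality benefit comes from.
    static void copyColumnWise(int[][] srcCols, int[][] destCols, int rowCount) {
        for (int col = 0; col < srcCols.length; col++) {
            System.arraycopy(srcCols[col], 0, destCols[col], 0, rowCount);
        }
    }
}
```

Both produce the same result; only the traversal order (and therefore the memory-access pattern) differs, which is what a benchmark of the batch-oriented variant would measure.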


---


[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)

2017-11-29 Thread paul-rogers
Github user paul-rogers commented on the issue:

https://github.com/apache/drill/pull/1057
  
To answer the two questions:

1. The copier is used in multiple locations, some of which include 
selection vectors. Sort uses a copier to merge rows coming from multiple sorted 
batches. The SVR compresses out SVs. A filter will produce an SV2 which the SVR 
removes. An in-memory sort produces an SV4. But, because of the ways plans are 
generated, the hash join will never see a batch with an SV. (An SVR will be 
inserted, if needed, to remove the SV.)

2. We never write a batch using an SV. The SV is always a source 
indirection. Because we do indirection on the source side (and vectors are 
append only), there can be no SV on the destination side.

Note also that the {{VectorContainer}} class, despite its API, knows 
nothing about SVs. The SV is tacked on separately by the {{RecordBatch}}. (This 
is a less-than-ideal design, but it is how things work at present.) FWIW, the 
test-oriented {{RowSet}} abstractions came about as wrappers around both the 
{{VectorContainer}} and the SV to provide a unified view.

Because of how we do SVs, you'll need three copy methods: one for no SV, 
one for an SV2 and another for an SV4.
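The source-side indirection can be sketched with plain arrays, an int[] standing in for an SV2 over a column vector (illustrative, not Drill's API):

```java
import java.util.Arrays;

public class SelectionVectorCopy {
    // No SV: logical row i is physical row i in the source; copy is direct.
    static int[] copyRows(int[] srcColumn, int rowCount) {
        return Arrays.copyOf(srcColumn, rowCount);
    }

    // SV2-style copy: a filter has left a selection vector of surviving row
    // indices. The indirection is resolved on the source (read) side only;
    // the destination is written densely, append-only -- which is why no SV
    // is ever needed on the destination side.
    static int[] copyRows(int[] srcColumn, int[] sv2) {
        int[] dest = new int[sv2.length];
        for (int i = 0; i < sv2.length; i++) {
            dest[i] = srcColumn[sv2[i]]; // read through the indirection
        }
        return dest;
    }
}
```

An SV4 variant follows the same shape, except each entry also selects which batch of a hyper-container to read from before picking the row.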

In the fullness of time, the new "column reader" and "column writer" 
abstractions will hide all this stuff, but it will take time before those tools 
come online.


---


[GitHub] drill issue #1057: DRILL-5993 Append Row Method For VectorContainer (WIP)

2017-11-29 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1057
  
@Ben-Zvi @paul-rogers 


---


[GitHub] drill pull request #1057: DRILL-5993 Append Row Method For VectorContainer (...

2017-11-29 Thread ilooner
GitHub user ilooner opened a pull request:

https://github.com/apache/drill/pull/1057

DRILL-5993 Append Row Method For VectorContainer (WIP)

## Motivation

HashJoin requires a method that can take a row from a VectorContainer and 
append it to a destination VectorContainer. This is a WIP and this PR is mainly 
opened to improve my understanding.

## Implementation

This is an initial implementation that works with simple VectorContainers 
that are not hyper batches and do not have selection vectors. It is also 
assumed that the user called **SchemaUtil.coerceContainer** on the source 
VectorContainer before using the newly added **appendRow** method.

## Questions

- Do we have to worry about selection vectors in the source container?
- Do we have to think about supporting hyper batches in the destination 
container?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilooner/drill DRILL-5993

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1057.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1057


commit ee43d6808562a1ff60c17fa7622b8358b63c7276
Author: Timothy Farkas 
Date:   2017-11-29T20:38:41Z

 - Initial Implementation of append row for a vector container




---


[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...

2017-11-29 Thread parthchandra
Github user parthchandra commented on a diff in the pull request:

https://github.com/apache/drill/pull/1049#discussion_r153909807
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java
 ---
@@ -138,6 +138,8 @@
 return new 
ParquetFixedWidthDictionaryReaders.DictionaryBigIntReader(recordReader, 
allocateSize, descriptor, columnChunkMetaData, fixedLength, (BigIntVector) v, 
schemaElement);
--- End diff --

OK I'll try to add these. BTW, I realized that the test files that I added 
for the unit tests are not annotated, so I'll need to fix those as well!


---


[GitHub] drill issue #1056: DRILL-6000: Categorized graceful shutdown unit tests as S...

2017-11-29 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1056
  
@arina-ielchiieva 


---


[GitHub] drill pull request #1056: DRILL-6000: Categorized graceful shutdown unit tes...

2017-11-29 Thread ilooner
GitHub user ilooner opened a pull request:

https://github.com/apache/drill/pull/1056

DRILL-6000: Categorized graceful shutdown unit tests as SlowTests

Graceful shutdown unit tests were failing on Travis, and should not be run 
as part of the SmokeTests

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ilooner/drill DRILL-6000

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1056.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1056


commit 5ecaade4b5cdc91a3153a4e2394cfdd993eeb5cf
Author: Timothy Farkas 
Date:   2017-11-29T18:52:29Z

DRILL-6000: Categorized graceful shutdown unit tests as SlowTests




---


[GitHub] drill issue #1055: DRILL-5973 : Support injection of time-bound pauses in se...

2017-11-29 Thread kkhatua
Github user kkhatua commented on the issue:

https://github.com/apache/drill/pull/1055
  
@laurentgo / @parthchandra 
Please review this. It is the basis for unit tests in DRILL-3640


---


[GitHub] drill pull request #1055: DRILL-5973 : Support injection of time-bound pause...

2017-11-29 Thread kkhatua
GitHub user kkhatua opened a pull request:

https://github.com/apache/drill/pull/1055

DRILL-5973 : Support injection of time-bound pauses in server

Support pause injections in the test framework that are time-bound, to 
allow for testing high-latency scenarios, 
e.g. a delayed server response to the Drill client allows for testing a 
server-induced timeout.
This would allow for testing of DRILL-3640 
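A minimal sketch of what "time-bound" means here (plain Thread.sleep standing in for the injection; this is not Drill's injection-framework API):

```java
public class TimedPause {
    // A time-bound pause sleeps for a configured duration and then resumes
    // on its own -- unlike an unbounded pause, which blocks until it is
    // externally released. Returns the approximate milliseconds slept, so a
    // test can verify the injected latency actually occurred.
    static long pauseMillis(long millis) throws InterruptedException {
        long start = System.nanoTime();
        Thread.sleep(millis);
        return (System.nanoTime() - start) / 1_000_000;
    }
}
```

Injecting such a pause on the server side of a request path lets a client-side timeout (e.g. a JDBC query timeout) fire deterministically in a unit test.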

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkhatua/drill DRILL-5973

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1055.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1055


commit 62e6f721183d648797d5329e94b277cd5722bba6
Author: Kunal Khatua 
Date:   2017-11-29T19:00:11Z

DRILL-5973 : Support injection of time-bound pauses in server

Support pause injections in the test framework that are time-bound, to 
allow for testing high-latency scenarios, 
e.g. a delayed server response to the Drill client allows for testing a 
server-induced timeout.




---


[jira] [Created] (DRILL-6000) Graceful Shutdown Unit Tests Should Not Be Run On Travis

2017-11-29 Thread Timothy Farkas (JIRA)
Timothy Farkas created DRILL-6000:
-

 Summary: Graceful Shutdown Unit Tests Should Not Be Run On Travis
 Key: DRILL-6000
 URL: https://issues.apache.org/jira/browse/DRILL-6000
 Project: Apache Drill
  Issue Type: Bug
Reporter: Timothy Farkas
Assignee: Timothy Farkas








Re: [EXT] Re: Food for thought about intra-document operation

2017-11-29 Thread Aman Sinha
Damien,
for the intra-document operations, it would be useful to add support for
LATERAL joins (SQL standard), which in conjunction with UNNEST (or FLATTEN)
should address the use case you have.  I have filed a JIRA for this:
https://issues.apache.org/jira/browse/DRILL-5999.

-Aman

On Tue, Sep 26, 2017 at 12:04 AM, Damien Profeta  wrote:

> Hello Aman,
>
> AsterixDB seems to follow standard SQL with a few minor modifications
> and adds functions to ease aggregations (array_count, array_avg…)
>
> That would tend to confirm at least that the support of unnest is a good
> idea to improve Drill.
>
> Best regards
>
> Damien
>
> **
>
> On 09/25/2017 07:53 PM, Aman Sinha wrote:
>
>> Damien,
>> thanks for initiating the discussion. Indeed, this would be a very useful
>> enhancement.  Currently, Drill provides repeated_contains()  for filtering
>> and repeated_count() for count aggregates on arrays but not the general
>> purpose intra-document operations that you need based on your example.
>> I haven't gone through all the alternatives but in addition to what you
>> have described,  you might also want to look at SQL++ (
>> https://ci.apache.org/projects/asterixdb/sqlpp/manual.html) which has
>> been
>> adopted by AsterixDB and has syntax extensions to SQL for unstructured
>> data.
>>
>> -Aman
>>
>> On Mon, Sep 25, 2017 at 6:10 AM, Damien Profeta <
>> damien.prof...@amadeus.com>
>> wrote:
>>
>>> Hello,
>>>
>>> A few formats handled by Drill enable working with documents, meaning
>>> nested and repeated structures instead of just tables. Json and Parquet
>>> are the two that come to my mind right now. Document modeling is a great
>>> way to express complex objects and is used a lot in my company. Drill is
>>> able to handle them but unfortunately, it cannot do much computation on
>>> them. By computation I mean filtering branches of the document, computing
>>> statistics (avg, min, max) on parts of the document … That would be very
>>> useful as an analytic tool.
>>>
>>> _What can be done_
>>>
>>> The question then is how to express the computation we want to do on the
>>> document. I have found multiple ways to handle that and I don't really
>>> know which one is the best, hence this mail to expose what I have found
>>> and initiate discussion.
>>>
>>> First, if we look back at the Dremel paper, which is the basis of the
>>> parquet format and also one of the examples for drill, Dremel adds the
>>> special keyword "WITHIN" to SQL to specify that the computation has to be
>>> done within a document. What is very powerful with this keyword is that
>>> it allows you to generate documents and doesn't force you to flatten
>>> everything. You can find examples of its usage in the google successor of
>>> Dremel, BigQuery, and its documentation:
>>> https://cloud.google.com/bigquery/docs/legacy-nested-repeated.
>>>
>>> But it seems that it was problematic for Google, because they now propose
>>> a SQL that seems to be compliant with SQL 2011 for BigQuery to handle
>>> such computation. I am not familiar with SQL 2011, but the BigQuery
>>> documentation says it integrates the keywords for nested and repeated
>>> structures. You can have a view of how this is done in BigQuery here:
>>> https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays .
>>> Basically, what I have seen is that they leverage the UNNEST and ARRAY
>>> keywords and then are able to use JOIN or CROSS JOIN to describe the
>>> aggregation.
>>>
>>> In Impala, they have added a way to add a subquery on a complex type in
>>> such a way that the subquery only acts intra-document. I have no idea if
>>> this is standard SQL or not. On the page https://www.cloudera.com/documentation/enterprise/5-5-x/topics/impala_complex_types.html#complex_types
>>> look at the phrase “The subquery labelled SUBQ1 is correlated:” for an
>>> example.
>>>
>>> In Presto, you can apply lambda functions to maps/arrays to transform the
>>> structure and apply filters on it. So you have the filter and map_filter
>>> functions to filter arrays and maps respectively. (cf
>>> https://prestodb.io/docs/current/functions/lambda.html#filter)
>>>
>>> _Example_
>>>
>>> If I want to make a short example, let’s say we have a flight with a
>>> group
>>> of passengers in it. A document would be :
>>>
>>> { “flightnb”:1234, “group”:[{“age”:30,”gender”:”M
>>> ”},{“age”:15,”gender”:”F”},
>>> {“age”:10,”gender”:”F”},{“age”:30,”gender”:”F”}]}
>>>
>>> The database would be millions of such document and I want to know the
>>> average age of the male passenger for every flight.
>>>
>>> In Dremel, the query would be something like: select flightnb,
>>> avg(male_age) within record from (select groups.age as male_age from
>>> flight
>>> where group.gender = "M")
>>>
>>> With sql, it would be something like: select flightnb, avg(male_age) from
>>> (array(select g.age as male_age from unnest(group)as g where g.gender =
>>> "M") as male_age)
>>>
>>> With impala it would be something 

[jira] [Created] (DRILL-5999) Add support for LATERAL join

2017-11-29 Thread Aman Sinha (JIRA)
Aman Sinha created DRILL-5999:
-

 Summary: Add support for LATERAL join
 Key: DRILL-5999
 URL: https://issues.apache.org/jira/browse/DRILL-5999
 Project: Apache Drill
  Issue Type: New Feature
  Components: Query Planning & Optimization
Affects Versions: 1.11.0
Reporter: Aman Sinha


The LATERAL keyword in SQL standard can precede a sub-SELECT FROM item. This 
allows the sub-SELECT to refer to columns of FROM items that appear before it 
in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently 
and so cannot cross-reference any other FROM item.)  

Calcite supports the LATERAL syntax.  In Drill, we should add support for it in 
the planning and execution phase.  

The main motivation for supporting it is that it makes it more expressive and 
performant to handle complex types such as arrays and maps.  For instance, 
suppose you have a customer table which contains 1 row per customer, containing 
customer-id, name and an array of Orders corresponding to each customer.   
Suppose you want to find out, for each customer, the average order 
amount.  This could be expressed as follows using the SQL standard LATERAL and 
UNNEST syntax:
{noformat}
SELECT customer_name FROM customers c 
   LATERAL (SELECT AVG(order_amount) FROM UNNEST(c.orders));
{noformat}

The subquery may contain other operations, such as filtering, which operate 
on the output of the un-nested c.orders array.  The UNNEST operation is 
supported in Drill today using the FLATTEN operator.  More details on the use 
cases for LATERAL are available from existing product documentation, e.g. see [1].   

[1] https://www.postgresql.org/docs/9.4/static/queries-table-expressions.html





[VOTE] Release Apache Drill 1.12.0 - rc0

2017-11-29 Thread Arina Ielchiieva
Hi all,

I'd like to propose the first release candidate (rc0) of Apache Drill,
version 1.12.0.

The release candidate covers a total of 167 resolved JIRAs [1]. Thanks to
everyone who contributed to this release.

The tarball artifacts are hosted at [2] and the maven artifacts are hosted
at [3].

This release candidate is based on commit
54d3d201882ef5bc2e0f754fd10edfead9947b60 located at [4].

The vote ends at 3:00 PM UTC (7:00 AM PT), December 1, 2017.

[ ] +1
[ ] +0
[ ] -1

Here's my vote: +1


[1]
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820&version=12341087
[2] http://home.apache.org/~arina/drill/releases/1.12.0/rc0/
[3] https://repository.apache.org/content/repositories/orgapachedrill-1043/
[4] https://github.com/arina-ielchiieva/drill/commits/drill-1.12.0

Kind regards
Arina


Re: [DISCUSS] Drill 1.12.0 release

2017-11-29 Thread Arina Yelchiyeva
All pending Jiras are merged. Starting release candidate preparation.

*Note for the committers:*
*until the release is over and the Drill version is changed to
1.13.0-SNAPSHOT, please do not push any changes into Drill master. *

Kind regards
Arina

On Mon, Nov 27, 2017 at 6:52 PM, Arina Yelchiyeva <
arina.yelchiy...@gmail.com> wrote:

> Current status:
>
> DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh, code
> reviewer - Paul) - fix is expected by the EOD in new pull request.
> DRILL-4286: Graceful shutdown of drillbit (developer - Jyothsna, code
> reviewer - Paul) - unit test failures are fixed. Unit test performance
> degraded x3 times!
>
> Kind regards
>
> On Sun, Nov 26, 2017 at 6:15 PM, Arina Yelchiyeva <
> arina.yelchiy...@gmail.com> wrote:
>
>> Current status:
>>
>> DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh, code
>> reviewer - Paul) - could not cherry-pick the commits. Needs fix.
>> DRILL-4286: Graceful shutdown of drillbit (developer - Jyothsna, code
>> reviewer - Paul) - there are unit test failures. Needs fix.
>>
>> Kind regards
>>
>> On Sat, Nov 25, 2017 at 11:53 PM, AnilKumar B 
>> wrote:
>>
>>> Hi Arina,
>>>
>>> Sorry for the delay. Just now we squashed Kafka storage plugin commits
>>> into
>>> one commit and pushed.
>>>
>>> Thanks & Regards,
>>> B Anil Kumar.
>>>
>>> On Sat, Nov 25, 2017 at 5:56 AM, Arina Yelchiyeva <
>>> arina.yelchiy...@gmail.com> wrote:
>>>
>>> > Current status:
>>> >
>>> > DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh,
>>> code
>>> > reviewer - Paul) - needs to squash the commits.
>>> > DRILL-4286: Graceful shutdown of drillbit (developer - Jyothsna, code
>>> > reviewer - Paul) - needs to address some code review comments.
>>> >
>>> > Kind regards
>>> > Arina
>>> >
>>> > On Wed, Nov 15, 2017 at 2:38 PM, Arina Yelchiyeva <
>>> > arina.yelchiy...@gmail.com> wrote:
>>> >
>>> > > Current status, we are close to the code freeze which will happen no
>>> > > later than the end of next week.
>>> > >
>>> > > Blocker:
>>> > > DRILL-5917: Ban org.json:json library in Drill (developer - Vlad R.,
>>> code
>>> > > reviewer - Arina) - in progress.
>>> > >
>>> > > Targeted for 1.12 release:
>>> > > DRILL-4779: Kafka storage plugin support (developer - Anil & Kamesh,
>>> code
>>> > > reviewer - Paul) - needs next round of code review.
>>> > > DRILL-5943: Avoid the strong check introduced by DRILL-5582 for PLAIN
>>> > > mechanism (developer - Sorabh, code reviewer - Parth & Laurent) -
>>> waiting
>>> > > for Parth code review.
>>> > > DRILL-5771: Fix serDe errors for format plugins (developer - Arina,
>>> code
>>> > > reviewer - Tim) - code review is done, waiting for the merge.
>>> > >
>>> > > Kind regards
>>> > >
>>> > > On Fri, Nov 10, 2017 at 9:32 AM, Chunhui Shi  wrote:
>>> > >
>>> > >> Hi Arina,
>>> > >>
>>> > >>
>>> > >> Could we consider to include DRILL-5089 in 1.12.0? It is about lazy
>>> > >> loading schema for storage plugins. Could you or Paul take a look
>>> at the
>>> > >> pull request for this JIRA https://github.com/apache/dril
>>> l/pull/1032? I
>>> > >> think both of you are familiar with this part.
>>> > >>
>>> > >>
>>> > >> Thanks,
>>> > >>
>>> > >>
>>> > >> Chunhui
>>> > >>
>>> > >> 
>>> > >> From: Arina Yelchiyeva 
>>> > >> Sent: Thursday, November 9, 2017 8:11:35 AM
>>> > >> To: dev@drill.apache.org
>>> > >> Subject: Re: [DISCUSS] Drill 1.12.0 release
>>> > >>
>>> > >> Yes, they are already in master.
>>> > >>
>>> > >> On Thu, Nov 9, 2017 at 6:05 PM, Charles Givre 
>>> wrote:
>>> > >>
>>> > >> > We’re including the Networking functions in this release right?
>>> > >> >
>>> > >> > > On Nov 9, 2017, at 11:04, Arina Yelchiyeva <
>>> > >> arina.yelchiy...@gmail.com>
>>> > >> > wrote:
>>> > >> > >
>>> > >> > > If changes will be done before cut off date, targeting mid
>>> November
>>> > >> that
>>> > >> > it
>>> > >> > > will be possible to include this Jira.
>>> > >> > >
>>> > >> > > On Thu, Nov 9, 2017 at 6:03 PM, Charles Givre >> >
>>> > >> wrote:
>>> > >> > >
>>> > >> > >> Hi Arina,
>>> > >> > >> Can we include DRILL-4091 Support for additional GIS
>>> operations in
>>> > >> > version
>>> > >> > >> 1.12?  In general the code looked pretty good.  There was a
>>> unit
>>> > test
>>> > >> > >> missing which the developer submitted and some minor formatting
>>> > >> issues
>>> > >> > >> which I’m still waiting on.
>>> > >> > >> Thanks,
>>> > >> > >> —C
>>> > >> > >>
>>> > >> > >>
>>> > >> > >>
>>> > >> > >>> On Nov 9, 2017, at 10:58, Arina Yelchiyeva <
>>> > >> arina.yelchiy...@gmail.com
>>> > >> > >
>>> > >> > >> wrote:
>>> > >> > >>>
>>> > >> > >>> Current status:
>>> > >> > >>>
>>> > >> > >>> Blocker:
>>> > >> > >>> DRILL-5917: Ban org.json:json library in Drill (developer -
>>> Vlad
>>> > R.,
>>> > >> > code
>>> > >> > >>> reviewer - ?) - in progress.
>>> > >> > >>>
>>> > >> > >>> Targeted for 1.12 release:
>>> > >> > >>> DRILL-5337: OpenTSDB plugin (develo

[GitHub] drill pull request #1050: DRILL-5964: Do not allow queries to access paths o...

2017-11-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1050


---


[GitHub] drill pull request #1053: DRILL-5989 Travis Finally Runs Smoke Tests!!!

2017-11-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1053


---


[GitHub] drill pull request #1038: DRILL-5972: Slow performance for query on INFORMAT...

2017-11-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1038


---


[GitHub] drill pull request #921: DRILL-4286 Graceful shutdown of drillbit

2017-11-29 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/921


---


[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...

2017-11-29 Thread vvysotskyi
Github user vvysotskyi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1049#discussion_r153739725
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java
 ---
@@ -138,6 +138,8 @@
 return new 
ParquetFixedWidthDictionaryReaders.DictionaryBigIntReader(recordReader, 
allocateSize, descriptor, columnChunkMetaData, fixedLength, (BigIntVector) v, 
schemaElement);
--- End diff --

The `DATE` logical type is also encoded as the `INT32` physical type [1], so 
could you please also add support for it?

[1] 
https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#date


---


[GitHub] drill pull request #1049: DRILL-5971: Fix INT64, INT32 logical types in comp...

2017-11-29 Thread vvysotskyi
Github user vvysotskyi commented on a diff in the pull request:

https://github.com/apache/drill/pull/1049#discussion_r153735994
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/store/parquet/columnreaders/ColumnReaderFactory.java
 ---
@@ -138,6 +138,8 @@
 return new 
ParquetFixedWidthDictionaryReaders.DictionaryBigIntReader(recordReader, 
allocateSize, descriptor, columnChunkMetaData, fixedLength, (BigIntVector) v, 
schemaElement);
--- End diff --

This comment should have been placed at line 
[117](https://github.com/apache/drill/pull/1049/files?diff=unified#diff-4a7ec07122bfb16e4ff696af256f56dcR117),
 but I could not add it there. 
Should we also handle the case when `columnChunkMetaData.getType()` is 
`INT32` and `convertedType` is `INT_32`? 


---