[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2023-06-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735764#comment-17735764
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1600977689

   #2810, #2809




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.21.0
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2023-06-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735428#comment-17735428
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

cgivre commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1599387786

   @kmatt A GitHub issue is good! Please be sure to tag @vvysotskyi in it, as he was the original developer of this plugin.






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2023-06-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735427#comment-17735427
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1599386040

   @vvysotskyi https://issues.apache.org/jira/browse/DRILL-8442
   
   Should this be a GitHub issue, or is Jira the correct place for it?






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642681#comment-17642681
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1335803778

   @cgivre @vvysotskyi Thanks, I missed the "will be" clause ;)




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.





[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642591#comment-17642591
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

cgivre commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1335460790

   @kmatt This hasn't been implemented yet. That's why the query doesn't work yet. @vvysotskyi is working on that. :-)






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642588#comment-17642588
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1335452201

   The version function seems not to parse:
   
   ```
   apache drill (dfs.delta)> select count(*) from table(dfs.delta.`delta_table`(type => 'delta'));
   +--------+
   | EXPR$0 |
   +--------+
   | 20     |
   +--------+
   1 row selected (0.157 seconds)
   
   apache drill (dfs.delta)> SELECT *
   2..semicolon> FROM table(dfs.delta.`delta_table`(type => 'delta', version => 0));
   Error: VALIDATION ERROR: From line 2, column 22 to line 2, column 75: No match found for function signature delta_table(type => , version => )
   ```






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642321#comment-17642321
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

vvysotskyi commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1334839612

   Hi @kmatt, no, it is not supported yet, but it will be added in the near future. The version will be specified using the table function. Here is an example query for it:
   ```sql
   SELECT *
   FROM table(dfs.delta.`/tmp/delta-table`(type => 'delta', version => 0));
   ```
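
The table-function form above can also be assembled programmatically. The following is a minimal Java sketch, not code from this PR: `DeltaVersionQuery` and `versionQuery` are hypothetical names, and it assumes the `version` parameter behaves as proposed in this comment (at the time of this thread it was not yet implemented).

```java
public class DeltaVersionQuery {

  // Hypothetical helper (not part of Drill): builds a query in the
  // table-function form shown in the comment above.
  static String versionQuery(String workspace, String tablePath, long version) {
    if (version < 0) {
      throw new IllegalArgumentException("version must be non-negative: " + version);
    }
    if (tablePath.contains("`")) {
      // The path is wrapped in backticks, so a backtick inside it would break quoting.
      throw new IllegalArgumentException("table path must not contain backticks");
    }
    return String.format(
        "SELECT * FROM table(%s.`%s`(type => 'delta', version => %d))",
        workspace, tablePath, version);
  }

  public static void main(String[] args) {
    // Same workspace, path, and version as in the example query above.
    System.out.println(versionQuery("dfs.delta", "/tmp/delta-table", 0));
  }
}
```

The resulting string could then be submitted through whichever Drill client is in use; validating the version and the path up front keeps malformed queries from reaching the planner.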






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642248#comment-17642248
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1334708491

   @vvysotskyi Does this support VERSION AS OF queries?
   
   
https://docs.delta.io/latest/quick-start.html#read-older-versions-of-data-using-time-travel
   
   Ex: `SELECT * FROM dfs.delta.`/tmp/delta-table` VERSION AS OF 0;`
   






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641049#comment-17641049
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1331605371

   On Windows 10, `git clone` fails because of the length of a path added in this patch. The repo clones successfully on Debian 11.
   
   ```
   git clone https://github.com/apache/drill.git
   Cloning into 'drill'...
   remote: Enumerating objects: 156537, done.
   remote: Counting objects: 100% (1323/1323), done.
   remote: Compressing objects: 100% (723/723), done.
   remote: Total 156537 (delta 322), reused 1119 (delta 218), pack-reused 155214
   Receiving objects: 100% (156537/156537), 65.97 MiB | 11.24 MiB/s, done.
   
   Resolving deltas: 100% (79075/79075), done.
   fatal: cannot create directory at 'contrib/format-deltalake/src/test/resources/data-reader-partition-values/as_int=0/as_long=0/as_byte=0/as_short=0/as_boolean=true/as_float=0.0/as_double=0.0/as_string=0/as_string_lit_null=null/as_date=2021-09-08/as_timestamp=2021-09-08 11%3A11%3A11': Filename too long
   warning: Clone succeeded, but checkout failed.
   You can inspect what was checked out with 'git status'
   and retry with 'git restore --source=HEAD :/'
   ```
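
To see why the checkout can fail, the length of the directory named in the error can be compared with the classic Windows `MAX_PATH` limit of 260 characters. The Java sketch below is an illustration only: the class name, the helper methods, and the default clone location are hypothetical, and it assumes Git is running without long-path support (Git for Windows has a `core.longpaths` setting that is commonly used to lift this limit).

```java
public class PathLengthCheck {

  // Classic Windows MAX_PATH limit; it includes the terminating NUL character.
  static final int WINDOWS_MAX_PATH = 260;

  // Directory named in the "Filename too long" error above, relative to the repo root.
  static final String LONGEST_TEST_DIR =
      "contrib/format-deltalake/src/test/resources/data-reader-partition-values/"
      + "as_int=0/as_long=0/as_byte=0/as_short=0/as_boolean=true/as_float=0.0/"
      + "as_double=0.0/as_string=0/as_string_lit_null=null/as_date=2021-09-08/"
      + "as_timestamp=2021-09-08 11%3A11%3A11";

  // Characters left for the clone location before the limit is reached.
  static int prefixBudget(String relativePath) {
    return WINDOWS_MAX_PATH - 1 - relativePath.length();
  }

  // True when cloneRoot + relativePath fits under the classic limit.
  static boolean fits(String cloneRoot, String relativePath) {
    return cloneRoot.length() <= prefixBudget(relativePath);
  }

  public static void main(String[] args) {
    // Hypothetical clone location; pass a real one as the first argument.
    String cloneRoot = args.length > 0 ? args[0] : "C:\\Users\\someone\\projects\\drill\\";
    System.out.println("relative path length: " + LONGEST_TEST_DIR.length());
    System.out.println("prefix budget:        " + prefixBudget(LONGEST_TEST_DIR));
    System.out.println("fits under MAX_PATH:  " + fits(cloneRoot, LONGEST_TEST_DIR));
  }
}
```

The directory alone leaves only a handful of characters for the clone location, and the files inside it make the full path longer still, which is consistent with the clone succeeding on Debian 11 but not on Windows 10.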
   






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636733#comment-17636733
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

cgivre merged PR #2702:
URL: https://github.com/apache/drill/pull/2702






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636136#comment-17636136
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

vvysotskyi commented on code in PR #2702:
URL: https://github.com/apache/drill/pull/2702#discussion_r1027065550


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/plan/DrillExprToDeltaTranslator.java:
##
@@ -0,0 +1,246 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.delta.plan;
+
+import io.delta.standalone.expressions.And;
+import io.delta.standalone.expressions.EqualTo;
+import io.delta.standalone.expressions.Expression;
+import io.delta.standalone.expressions.GreaterThan;
+import io.delta.standalone.expressions.GreaterThanOrEqual;
+import io.delta.standalone.expressions.IsNotNull;
+import io.delta.standalone.expressions.IsNull;
+import io.delta.standalone.expressions.LessThan;
+import io.delta.standalone.expressions.LessThanOrEqual;
+import io.delta.standalone.expressions.Literal;
+import io.delta.standalone.expressions.Not;
+import io.delta.standalone.expressions.Or;
+import io.delta.standalone.expressions.Predicate;
+import io.delta.standalone.types.StructType;
+import org.apache.drill.common.FunctionNames;
+import org.apache.drill.common.expression.FunctionCall;
+import org.apache.drill.common.expression.LogicalExpression;
+import org.apache.drill.common.expression.PathSegment;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.expression.ValueExpressions;
+import org.apache.drill.common.expression.visitors.AbstractExprVisitor;
+
+public class DrillExprToDeltaTranslator extends AbstractExprVisitor<Expression, Void, RuntimeException> {
+
+  private final StructType structType;
+
+  public DrillExprToDeltaTranslator(StructType structType) {
+    this.structType = structType;
+  }
+
+  @Override
+  public Expression visitFunctionCall(FunctionCall call, Void value) {
+    try {
+      return visitFunctionCall(call);
+    } catch (Exception e) {
+      return null;
+    }
+  }
+
+  private Predicate visitFunctionCall(FunctionCall call) {
+    switch (call.getName()) {
+      case FunctionNames.AND: {
+        Expression left = call.arg(0).accept(this, null);
+        Expression right = call.arg(1).accept(this, null);
+        if (left != null && right != null) {
+          return new And(left, right);
+        }
+        return null;
+      }
+      case FunctionNames.OR: {
+        Expression left = call.arg(0).accept(this, null);
+        Expression right = call.arg(1).accept(this, null);
+        if (left != null && right != null) {
+          return new Or(left, right);
+        }
+        return null;
+      }
+      case FunctionNames.NOT: {
+        Expression expression = call.arg(0).accept(this, null);
+        if (expression != null) {
+          return new Not(expression);
+        }
+        return null;
+      }
+      case FunctionNames.IS_NULL: {
+        LogicalExpression arg = call.arg(0);
+        if (arg instanceof SchemaPath) {
+          String name = getPath((SchemaPath) arg);
+          return new IsNull(structType.column(name));
+        }
+        return null;
+      }
+      case FunctionNames.IS_NOT_NULL: {
+        LogicalExpression arg = call.arg(0);
+        if (arg instanceof SchemaPath) {
+          String name = getPath((SchemaPath) arg);
+          return new IsNotNull(structType.column(name));
+        }
+        return null;
+      }
+      case FunctionNames.LT: {
+        LogicalExpression nameRef = call.arg(0);
+        Expression expression = call.arg(1).accept(this, null);
+        if (nameRef instanceof SchemaPath) {
+          String name = getPath((SchemaPath) nameRef);
+          return new LessThan(structType.column(name), expression);
+        }
+        return null;
+      }
+      case FunctionNames.LE: {
+        LogicalExpression nameRef = call.arg(0);
+        Expression expression = call.arg(1).accept(this, null);
+        if (nameRef instanceof SchemaPath) {
+          String name = getPath((SchemaPath) nameRef);
+          return new LessThanOrEqual(structType.column(name), 

[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17634334#comment-17634334
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

jnturton commented on code in PR #2702:
URL: https://github.com/apache/drill/pull/2702#discussion_r1017953161


##
contrib/format-deltalake/README.md:
##
@@ -0,0 +1,36 @@
+# Delta Lake format plugin
+
+This format plugin enabled Drill to query Delta Lake tables.

Review Comment:
   ```suggestion
   This format plugin enables Drill to query Delta Lake tables.
   ```



##
contrib/format-deltalake/README.md:
##
@@ -0,0 +1,36 @@
+# Delta Lake format plugin
+
+This format plugin enabled Drill to query Delta Lake tables.
+
+## Supported optimizations and features
+
+### Project pushdown
+
+This format plugin supports project and filter pushdown optimizations.
+
+For the case of project pushdown, only columns specified in the query will be read, even they are nested columns.

Review Comment:
   ```suggestion
   For the case of project pushdown, only columns specified in the query will be read, even when they are nested columns.
   ```



##
contrib/format-deltalake/src/test/java/org/apache/drill/exec/store/delta/DeltaQueriesTest.java:
##
@@ -0,0 +1,195 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.delta;
+
+import org.apache.drill.common.logical.FormatPluginConfig;
+import org.apache.drill.common.logical.security.PlainCredentialsProvider;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.delta.format.DeltaFormatPluginConfig;
+import org.apache.drill.exec.store.dfs.FileSystemConfig;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.math.BigDecimal;
+import java.nio.file.Paths;
+import java.util.HashMap;
+import java.util.Map;
+
+import static org.apache.drill.exec.util.StoragePluginTestUtils.DFS_PLUGIN_NAME;
+import static org.junit.Assert.assertEquals;
+
+public class DeltaQueriesTest extends ClusterTest {
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+    startCluster(ClusterFixture.builder(dirTestWatcher));
+
+    StoragePluginRegistry pluginRegistry = cluster.drillbit().getContext().getStorage();
+    FileSystemConfig pluginConfig = (FileSystemConfig) pluginRegistry.getPlugin(DFS_PLUGIN_NAME).getConfig();
+    Map<String, FormatPluginConfig> formats = new HashMap<>(pluginConfig.getFormats());
+    formats.put("delta", new DeltaFormatPluginConfig());
+    FileSystemConfig newPluginConfig = new FileSystemConfig(
+      pluginConfig.getConnection(),
+      pluginConfig.getConfig(),
+      pluginConfig.getWorkspaces(),
+      formats,
+      PlainCredentialsProvider.EMPTY_CREDENTIALS_PROVIDER);
+    newPluginConfig.setEnabled(pluginConfig.isEnabled());
+    pluginRegistry.put(DFS_PLUGIN_NAME, newPluginConfig);
+
+    dirTestWatcher.copyResourceToRoot(Paths.get("data-reader-primitives"));
+    dirTestWatcher.copyResourceToRoot(Paths.get("data-reader-partition-values"));
+    dirTestWatcher.copyResourceToRoot(Paths.get("data-reader-nested-struct"));
+  }
+
+  @Test
+  public void testSerDe() throws Exception {
+    String plan = queryBuilder().sql("select * from dfs.`data-reader-partition-values`").explainJson();
+    long count = queryBuilder().physical(plan).run().recordCount();
+    assertEquals(3, count);
+  }
+
+  @Test
+  public void testAllPrimitives() throws Exception {
+    testBuilder()
+      .sqlQuery("select * from dfs.`data-reader-primitives`")
+      .ordered()
+      .baselineColumns("as_int", "as_long", "as_byte", "as_short", "as_boolean", "as_float",
+        "as_double", "as_string", "as_binary", "as_big_decimal")
+      .baselineValues(null, null, null, null, null, null, null, null, null, null)
+      .baselineValues(0, 0L, 0, 0, true, 0.0f, 0.0, "0", new byte[]{0, 0}, BigDecimal.valueOf(0))
+      .baselineValues(1, 1L, 1, 1, false, 1.0f, 1.0, "1", new byte[]{1, 1}, BigDecimal.valueOf(1))
+      .baselineValues(2, 2L, 2, 2, true, 2.0f, 2.0, "2", 

[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17630560#comment-17630560
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

vvysotskyi opened a new pull request, #2702:
URL: https://github.com/apache/drill/pull/2702

   # [DRILL-8353](https://issues.apache.org/jira/browse/DRILL-8353): Format plugin for Delta Lake
   
   ## Description
   This pull request adds support for reading Delta Lake tables.
   
   ## Documentation
   See README.md
   
   ## Testing
   Added unit tests.
   



