[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-17 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16441684#comment-16441684
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-382195795
 
 
   Hi @laurentgo
   
   Please hold off until I push some additional code changes based on the test 
cases. You can continue the code review once that is done. 
   
   Thanks.
   


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level, the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with the Arrow objects/structures with the usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from an RDBMS and convert the data into Arrow 
> objects/structures, so from that perspective it only reads data from the RDBMS. 
> Whether the utility can also push Arrow objects to the RDBMS still needs to be 
> discussed and is out of scope for this utility for now. 
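
To make the intended usage concrete, here is a minimal Java sketch. It assumes the JdbcToArrow.sqlToArrow(Connection, String, RootAllocator) entry point that appears in the pull-request diff later in this thread; the H2 connection URL and the my_table name are placeholders, not part of the proposal.

import java.sql.Connection;
import java.sql.DriverManager;

import org.apache.arrow.adapter.jdbc.JdbcToArrow;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;

public class JdbcToArrowUsageSketch {
    public static void main(String[] args) throws Exception {
        // The H2 in-memory URL and the table name are placeholders for illustration only.
        try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:test");
             RootAllocator allocator = new RootAllocator(Integer.MAX_VALUE)) {
            // sqlToArrow runs the query and copies the ResultSet into Arrow vectors.
            try (VectorSchemaRoot root = JdbcToArrow.sqlToArrow(conn, "SELECT * FROM my_table", allocator)) {
                System.out.println("rows: " + root.getRowCount());
            }
        }
    }
}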



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-12 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16436765#comment-16436765
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-381014810
 
 
   Hi @laurentgo ,
   
   I have taken care of the following review comments:
   1. Used BaseAllocator instead of RootAllocator.
   2. Added a Calendar object as another argument to the API so that it gets used for Date, Time, and Timestamp data values.
   3. Handled NULL values for all the data types. I am basically using Nullable DataHolder objects as much as possible (a simplified sketch of the idea follows below); test cases still need to be added.
   4. For the BigDecimal data type, used the getLong() API instead of getInt().
   5. Used StandardCharsets.UTF_8 as the charset wherever we convert strings to bytes.
   
   Things that are still not done:
   1. I am unable to use the streaming approach for Blob and Clob, as I couldn't figure out a way to populate the destination ArrowBuffer in a streaming manner.
   2. I still need to take care of the precision of Timestamp values for nano/micro/milli units.
   3. The Array data type is not yet supported.
   4. For the "default" switch case, control shouldn't get there, so I could throw an exception if that makes sense.
   
   Let me know your comments.
   
   Thanks. 
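
A minimal sketch of the null-handling idea from item 3 above, assuming a VarCharVector column. The PR itself works through Nullable*Holder objects, so this only illustrates the setNull/setSafe pattern rather than the actual implementation; the helper class and method names here are made up.

import java.nio.charset.StandardCharsets;
import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.arrow.vector.VarCharVector;

final class VarCharNullHandlingSketch {

    // Copies column i of the ResultSet's current row into the vector at rowIndex,
    // writing an explicit null when the database value was SQL NULL.
    static void setVarChar(ResultSet rs, int i, VarCharVector vector, int rowIndex) throws SQLException {
        String value = rs.getString(i);
        if (rs.wasNull()) {
            vector.setNull(rowIndex);
        } else {
            vector.setSafe(rowIndex, value.getBytes(StandardCharsets.UTF_8));
        }
        vector.setValueCount(rowIndex + 1);
    }
}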




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434383#comment-16434383
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180855624
 
 

 ##
 File path: java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/AbstractJdbcToArrowTest.java
 ##
 @@ -0,0 +1,66 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.sql.Connection;
+import java.sql.Statement;
+
+/**
+ * Class to abstract out some common test functionality for testing JDBC to Arrow.
+ */
+public abstract class AbstractJdbcToArrowTest {
+
+    protected void createTestData(Connection conn, Table table) throws Exception {
+
+        Statement stmt = null;
+        try {
+            //create the table and insert the data and once done drop the table
+            stmt = conn.createStatement();
+            stmt.executeUpdate(table.getCreate());
+
+            for (String insert: table.getData()) {
+                stmt.executeUpdate(insert);
+            }
+
+        } catch (Exception e) {
+            e.printStackTrace();
+        } finally {
 
 Review comment:
   Thanks @laurentgo for the comments. I should be able to get back to you soon with further changes. We are still getting some work done by our India development team member on the test-case-related changes. Let me ping you on Slack for any quick discussion.




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434204#comment-16434204
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to convert 
Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-380521328
 
 
   @atuldambalkar I can be reached on slack if you need me




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434167#comment-16434167
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180815237
 
 

 ##
 File path: java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,431 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+
+import com.google.common.base.Preconditions;
+import org.apache.arrow.vector.BaseFixedWidthVector;
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.math.BigDecimal;
+
+import java.nio.charset.StandardCharsets;
+import java.sql.Blob;
+import java.sql.Clob;
+import java.sql.Date;
+import java.sql.ResultSet;
+import java.sql.ResultSetMetaData;
+import java.sql.SQLException;
+import java.sql.Time;
+import java.sql.Timestamp;
+import java.sql.Types;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+Preconditions.checkNotNull(rsmd, "JDBC ResultSetMetaData object can't 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434161#comment-16434161
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180253135
 
 

 ##
 File path: java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -200,144 +226,206 @@ public static void jdbcToArrowVectors(ResultSet rs, VectorSchemaRoot root) throw
                 switch (rsmd.getColumnType(i)) {
                     case Types.BOOLEAN:
                     case Types.BIT:
-                        BitVector bitVector = (BitVector) root.getVector(columnName);
-                        bitVector.setSafe(rowCount, rs.getBoolean(i)? 1: 0);
-                        bitVector.setValueCount(rowCount + 1);
+                        updateVector((BitVector)root.getVector(columnName),
+                                rs.getBoolean(i), rowCount);
                         break;
                     case Types.TINYINT:
-                        TinyIntVector tinyIntVector = (TinyIntVector)root.getVector(columnName);
-                        tinyIntVector.setSafe(rowCount, rs.getInt(i));
-                        tinyIntVector.setValueCount(rowCount + 1);
+                        updateVector((TinyIntVector)root.getVector(columnName),
+                                rs.getInt(i), rowCount);
                         break;
                     case Types.SMALLINT:
-                        SmallIntVector smallIntVector = (SmallIntVector)root.getVector(columnName);
-                        smallIntVector.setSafe(rowCount, rs.getInt(i));
-                        smallIntVector.setValueCount(rowCount + 1);
+                        updateVector((SmallIntVector)root.getVector(columnName),
+                                rs.getInt(i), rowCount);
                         break;
                     case Types.INTEGER:
-                        IntVector intVector = (IntVector)root.getVector(columnName);
-                        intVector.setSafe(rowCount, rs.getInt(i));
-                        intVector.setValueCount(rowCount + 1);
+                        updateVector((IntVector)root.getVector(columnName),
+                                rs.getInt(i), rowCount);
                         break;
                     case Types.BIGINT:
-                        BigIntVector bigIntVector = (BigIntVector)root.getVector(columnName);
-                        bigIntVector.setSafe(rowCount, rs.getInt(i));
-                        bigIntVector.setValueCount(rowCount + 1);
+                        updateVector((BigIntVector)root.getVector(columnName),
+                                rs.getInt(i), rowCount);
 
 Review comment:
   Since BIGINT is a 64-bit integer, this should probably use rs.getLong() (maybe add unit tests with large values, both positive and negative?).
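
A small sketch of the suggested fix, in the style of the pre-refactoring code in the diff above. The updateVector helper's signature is not shown in this thread, so the example writes to the BigIntVector directly; the class and method names here are made up.

import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.arrow.vector.BigIntVector;

final class BigIntColumnSketch {

    // Reads column i with getLong() so values outside the int range survive the
    // copy into the 64-bit BigIntVector.
    static void setBigInt(ResultSet rs, int i, BigIntVector vector, int rowIndex) throws SQLException {
        vector.setSafe(rowIndex, rs.getLong(i));
        vector.setValueCount(rowIndex + 1);
    }
}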




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434159#comment-16434159
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180811074
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,95 @@
+
+
+
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+    <parent>
+        <groupId>org.apache.arrow</groupId>
+        <artifactId>arrow-java-root</artifactId>
+        <version>0.10.0-SNAPSHOT</version>
+    </parent>
+
+    <artifactId>arrow-jdbc</artifactId>
+    <name>Arrow JDBC Adapter</name>
+    <url>http://maven.apache.org</url>
+
+    <dependencies>
+
+        <dependency>
+            <groupId>org.apache.arrow</groupId>
+            <artifactId>arrow-memory</artifactId>
+            <version>${project.version}</version>
+        </dependency>
+
+        <dependency>
+            <groupId>org.apache.arrow</groupId>
+            <artifactId>arrow-vector</artifactId>
+            <version>${project.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.guava</groupId>
+            <artifactId>guava</artifactId>
+            <version>18.0</version>
+        </dependency>
+
+        <dependency>
+            <groupId>junit</groupId>
+            <artifactId>junit</artifactId>
+            <version>4.11</version>
+            <scope>test</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>com.h2database</groupId>
+            <artifactId>h2</artifactId>
+            <version>1.4.196</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.fasterxml.jackson.dataformat</groupId>
+            <artifactId>jackson-dataformat-yaml</artifactId>
+            <version>2.7.9</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>com.fasterxml.jackson.core</groupId>
+            <artifactId>jackson-databind</artifactId>
+            <version>2.7.9</version>
+            <scope>test</scope>
+        </dependency>
+
+        <dependency>
+            <groupId>com.google.collections</groupId>
 
 Review comment:
   That seems like a legacy library, before Guava was created...




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434157#comment-16434157
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180249387
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -62,10 +68,11 @@
             <version>2.7.9</version>
             <scope>test</scope>
         </dependency>
+
         <dependency>
-            <groupId>com.google.guava</groupId>
-            <artifactId>guava</artifactId>
-            <version>18.0</version>
+            <groupId>com.google.collections</groupId>
 
 Review comment:
   isn't that deprecated in favor of guava? (last update is 2009...)




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434163#comment-16434163
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180811190
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,95 @@
+
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+4.0.0
+
+org.apache.arrow
+arrow-java-root
+0.10.0-SNAPSHOT
+
+
+arrow-jdbc
+Arrow JDBC Adapter
+http://maven.apache.org
+
+
+
+
+org.apache.arrow
+arrow-memory
+${project.version}
+
+
+
+org.apache.arrow
+arrow-vector
+${project.version}
+
+
+com.google.guava
+guava
+18.0
+
+
+
+
+
+junit
+junit
+4.11
 
 Review comment:
   replace with ${dep.junit.version}




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434169#comment-16434169
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180818053
 
 

 ##
 File path: java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,431 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+
+import com.google.common.base.Preconditions;
+import org.apache.arrow.vector.BaseFixedWidthVector;
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.math.BigDecimal;
+
+import java.nio.charset.StandardCharsets;
+import java.sql.Blob;
+import java.sql.Clob;
+import java.sql.Date;
+import java.sql.ResultSet;
+import java.sql.ResultSetMetaData;
+import java.sql.SQLException;
+import java.sql.Time;
+import java.sql.Timestamp;
+import java.sql.Types;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+Preconditions.checkNotNull(rsmd, "JDBC ResultSetMetaData object can't 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434155#comment-16434155
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180249672
 
 

 ##
 File path: java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -64,53 +68,48 @@
      * @param connection Database connection to be used. This method will not close the passed connection object. Since hte caller has passed
      *                   the connection object it's the responsibility of the caller to close or return the connection to the pool.
      * @param query The DB Query to fetch the data.
-     * @return
-     * @throws SQLException Propagate any SQL Exceptions to the caller after closing any resources opened such as ResultSet and Statment objects.
+     * @return Arrow Data Objects {@link VectorSchemaRoot}
+     * @throws SQLException Propagate any SQL Exceptions to the caller after closing any resources opened such as ResultSet and Statement objects.
      */
-    public static VectorSchemaRoot sqlToArrow(Connection connection, String query) throws Exception {
-
-        assert connection != null: "JDBC conncetion object can not be null";
-        assert query != null && query.length() > 0: "SQL query can not be null or empty";
-
-        RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
+    public static VectorSchemaRoot sqlToArrow(Connection connection, String query, RootAllocator rootAllocator) throws SQLException {
+        Preconditions.checkNotNull(connection, "JDBC connection object can not be null");
+        Preconditions.checkArgument(query != null && query.length() > 0, "SQL query can not be null or empty");
 
-        Statement stmt = null;
-        ResultSet rs = null;
-        try {
-            stmt = connection.createStatement();
-            rs = stmt.executeQuery(query);
-            ResultSetMetaData rsmd = rs.getMetaData();
-            VectorSchemaRoot root = VectorSchemaRoot.create(
-                    JdbcToArrowUtils.jdbcToArrowSchema(rsmd), rootAllocator);
-            JdbcToArrowUtils.jdbcToArrowVectors(rs, root);
-            return root;
-        } catch (Exception exc) {
-            // just throw it out after logging
-            throw exc;
-        } finally {
-            if (rs != null) {
-                rs.close();
-            }
-            if (stmt != null) {
-                stmt.close(); // test
-            }
+        try (Statement stmt = connection.createStatement()) {
+            return sqlToArrow(stmt.executeQuery(query), rootAllocator);
         }
     }
 
     /**
-     * This method returns ArrowDataFetcher Object that can be used to fetch and iterate on the data in the given
-     * database table.
-     *
-     * @param connection - Database connection Object
-     * @param tableName - Table name from which records will be fetched
+     * For the given JDBC {@link ResultSet}, fetch the data from Relational DB and convert it to Arrow objects.
      *
-     * @return ArrowDataFetcher - Instance of ArrowDataFetcher which can be used to get Arrow Vector obejcts by calling its functionality
+     * @param resultSet
+     * @return Arrow Data Objects {@link VectorSchemaRoot}
+     * @throws Exception
      */
-    public static ArrowDataFetcher jdbcArrowDataFetcher(Connection connection, String tableName) {
-        assert connection != null: "JDBC conncetion object can not be null";
-        assert tableName != null && tableName.length() > 0: "Table name can not be null or empty";
+    public static VectorSchemaRoot sqlToArrow(ResultSet resultSet) throws SQLException {
+        Preconditions.checkNotNull(resultSet, "JDBC ResultSet object can not be null");
 
-        return new ArrowDataFetcher(connection, tableName);
+        RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
+        VectorSchemaRoot root = sqlToArrow(resultSet, rootAllocator);
+        rootAllocator.close();
+        return root;
     }
 
+    /**
+     * For the given JDBC {@link ResultSet}, fetch the data from Relational DB and convert it to Arrow objects.
+     *
+     * @param resultSet
+     * @return Arrow Data Objects {@link VectorSchemaRoot}
+     * @throws Exception
+     */
+    public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, RootAllocator rootAllocator) throws SQLException {
 
 Review comment:
   I know I mentioned RootAllocator, but I guess BufferAllocator (which is the 
base interface) would work as well?
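
A sketch of what the widened signature might look like, assuming only what the diff above shows; the class name is made up and the body is left as a stub.

import java.sql.ResultSet;
import java.sql.SQLException;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.vector.VectorSchemaRoot;

public final class AllocatorSignatureSketch {

    // Accepting the BufferAllocator interface keeps existing RootAllocator callers
    // working, since RootAllocator implements BufferAllocator.
    public static VectorSchemaRoot sqlToArrow(ResultSet resultSet, BufferAllocator allocator) throws SQLException {
        // The body would mirror the RootAllocator overload shown in the diff above.
        throw new UnsupportedOperationException("signature sketch only");
    }
}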


This is an automated message from the Apache Git Service.
To respond to the message, please log on 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434168#comment-16434168
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180817457
 
 

 ##
 File path: java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,431 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+
+import com.google.common.base.Preconditions;
+import org.apache.arrow.vector.BaseFixedWidthVector;
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.math.BigDecimal;
+
+import java.nio.charset.StandardCharsets;
+import java.sql.Blob;
+import java.sql.Clob;
+import java.sql.Date;
+import java.sql.ResultSet;
+import java.sql.ResultSetMetaData;
+import java.sql.SQLException;
+import java.sql.Time;
+import java.sql.Timestamp;
+import java.sql.Types;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+Preconditions.checkNotNull(rsmd, "JDBC ResultSetMetaData object can't 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434164#comment-16434164
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180818360
 
 

 ##
 File path: java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/AbstractJdbcToArrowTest.java
 ##
 @@ -0,0 +1,66 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.sql.Connection;
+import java.sql.Statement;
+
+/**
+ * Class to abstract out some common test functionality for testing JDBC to Arrow.
+ */
+public abstract class AbstractJdbcToArrowTest {
+
+    protected void createTestData(Connection conn, Table table) throws Exception {
+
+        Statement stmt = null;
+        try {
+            //create the table and insert the data and once done drop the table
+            stmt = conn.createStatement();
+            stmt.executeUpdate(table.getCreate());
+
+            for (String insert: table.getData()) {
+                stmt.executeUpdate(insert);
+            }
+
+        } catch (Exception e) {
+            e.printStackTrace();
+        } finally {
 
 Review comment:
   you should use the `try`-with-resources construct instead...
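
A sketch of the quoted helper rewritten with try-with-resources, keeping the Table fixture type from the PR's test code; the sketch class name here is made up.

import java.sql.Connection;
import java.sql.Statement;

public abstract class AbstractJdbcToArrowTestSketch {

    // Same helper with try-with-resources: the Statement is closed automatically and
    // any failure propagates to the test instead of being swallowed by printStackTrace().
    protected void createTestData(Connection conn, Table table) throws Exception {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(table.getCreate());
            for (String insert : table.getData()) {
                stmt.executeUpdate(insert);
            }
        }
    }
}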




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434166#comment-16434166
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180810834
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,95 @@
+
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+4.0.0
+
+org.apache.arrow
+arrow-java-root
+0.10.0-SNAPSHOT
+
+
+arrow-jdbc
+Arrow JDBC Adapter
+http://maven.apache.org
+
+
+
+
+org.apache.arrow
+arrow-memory
+${project.version}
+
+
+
+org.apache.arrow
+arrow-vector
+${project.version}
+
+
+com.google.guava
+guava
+18.0
+
+
+
+
+
+junit
+junit
+4.11
+test
+
+
+
+com.h2database
+h2
+1.4.196
+test
+
+
+com.fasterxml.jackson.dataformat
+jackson-dataformat-yaml
+2.7.9
+test
+
+
+com.fasterxml.jackson.core
+jackson-databind
+2.7.9
 
 Review comment:
   replace with ${dep.jackson.version}




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434156#comment-16434156
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180252798
 
 

 ##
 File path: java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+    public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws SQLException {
+
+        assert rsmd != null;
+
+        //ImmutableList.Builder<Field> fields = ImmutableList.builder();
+        List<Field> fields = new ArrayList<>();
+        int columnCount = rsmd.getColumnCount();
+        for (int i = 1; i <= columnCount; i++) {
+            String columnName = rsmd.getColumnName(i);
+            switch (rsmd.getColumnType(i)) {
+                case Types.BOOLEAN:
+                case Types.BIT:
+                    fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Bool()), null));
+                    break;
+                case Types.TINYINT:
+                    fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Int(8, true)), null));
+                    break;
+                case Types.SMALLINT:
+                    fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Int(16, true)), null));
+                    break;
+                case Types.INTEGER:
+                    fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Int(32, true)), null));
+                    break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434160#comment-16434160
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180810807
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,95 @@
+
+
+
+http://maven.apache.org/POM/4.0.0; 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance; 
xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 
http://maven.apache.org/xsd/maven-4.0.0.xsd;>
+4.0.0
+
+org.apache.arrow
+arrow-java-root
+0.10.0-SNAPSHOT
+
+
+arrow-jdbc
+Arrow JDBC Adapter
+http://maven.apache.org
+
+
+
+
+org.apache.arrow
+arrow-memory
+${project.version}
+
+
+
+org.apache.arrow
+arrow-vector
+${project.version}
+
+
+com.google.guava
+guava
+18.0
+
+
+
+
+
+junit
+junit
+4.11
+test
+
+
+
+com.h2database
+h2
+1.4.196
+test
+
+
+com.fasterxml.jackson.dataformat
+jackson-dataformat-yaml
+2.7.9
 
 Review comment:
   replace with ${dep.jackson.version}




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434165#comment-16434165
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180815032
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,431 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+
+import com.google.common.base.Preconditions;
+import org.apache.arrow.vector.BaseFixedWidthVector;
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.math.BigDecimal;
+
+import java.nio.charset.StandardCharsets;
+import java.sql.Blob;
+import java.sql.Clob;
+import java.sql.Date;
+import java.sql.ResultSet;
+import java.sql.ResultSetMetaData;
+import java.sql.SQLException;
+import java.sql.Time;
+import java.sql.Timestamp;
+import java.sql.Types;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+Preconditions.checkNotNull(rsmd, "JDBC ResultSetMetaData object can't 
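
For illustration, a minimal usage sketch of the mapping documented above, based on the jdbcToArrowSchema signature quoted here; the connection, query, and table name are placeholders and the usual java.sql / Arrow pojo imports from the quoted file are assumed:

    // Sketch only: prints the Arrow fields produced for an arbitrary query result,
    // assuming an open java.sql.Connection (the query text is a placeholder).
    static void printArrowSchema(Connection conn) throws Exception {
        try (Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("select * from test_table")) {
            // Build the Arrow schema from the JDBC metadata using the documented type mapping.
            Schema schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.getMetaData());
            for (Field field : schema.getFields()) {
                System.out.println(field.getName() + " -> " + field.getType());
            }
        }
    }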

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434158#comment-16434158
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180810672
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,95 @@
+<?xml version="1.0"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <parent>
+    <groupId>org.apache.arrow</groupId>
+    <artifactId>arrow-java-root</artifactId>
+    <version>0.10.0-SNAPSHOT</version>
+  </parent>
+
+  <artifactId>arrow-jdbc</artifactId>
+  <name>Arrow JDBC Adapter</name>
+  <url>http://maven.apache.org</url>
+
+  <dependencies>
+    <dependency>
+      <groupId>org.apache.arrow</groupId>
+      <artifactId>arrow-memory</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>org.apache.arrow</groupId>
+      <artifactId>arrow-vector</artifactId>
+      <version>${project.version}</version>
+    </dependency>
+    <dependency>
+      <groupId>com.google.guava</groupId>
+      <artifactId>guava</artifactId>
+      <version>18.0</version>
 
 Review comment:
   replace with ${dep.guava.version}


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16434162#comment-16434162
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180253328
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -200,144 +226,206 @@ public static void jdbcToArrowVectors(ResultSet rs, 
VectorSchemaRoot root) throw
 switch (rsmd.getColumnType(i)) {
 case Types.BOOLEAN:
 case Types.BIT:
-BitVector bitVector = (BitVector) 
root.getVector(columnName);
-bitVector.setSafe(rowCount, rs.getBoolean(i)? 1: 0);
-bitVector.setValueCount(rowCount + 1);
+updateVector((BitVector)root.getVector(columnName),
+rs.getBoolean(i), rowCount);
 break;
 case Types.TINYINT:
-TinyIntVector tinyIntVector = 
(TinyIntVector)root.getVector(columnName);
-tinyIntVector.setSafe(rowCount, rs.getInt(i));
-tinyIntVector.setValueCount(rowCount + 1);
+updateVector((TinyIntVector)root.getVector(columnName),
+rs.getInt(i), rowCount);
 break;
 case Types.SMALLINT:
-SmallIntVector smallIntVector = 
(SmallIntVector)root.getVector(columnName);
-smallIntVector.setSafe(rowCount, rs.getInt(i));
-smallIntVector.setValueCount(rowCount + 1);
+
updateVector((SmallIntVector)root.getVector(columnName),
+rs.getInt(i), rowCount);
 break;
 case Types.INTEGER:
-IntVector intVector = 
(IntVector)root.getVector(columnName);
-intVector.setSafe(rowCount, rs.getInt(i));
-intVector.setValueCount(rowCount + 1);
+updateVector((IntVector)root.getVector(columnName),
+rs.getInt(i), rowCount);
 break;
 case Types.BIGINT:
-BigIntVector bigIntVector = 
(BigIntVector)root.getVector(columnName);
-bigIntVector.setSafe(rowCount, rs.getInt(i));
-bigIntVector.setValueCount(rowCount + 1);
+updateVector((BigIntVector)root.getVector(columnName),
+rs.getInt(i), rowCount);
 break;
 case Types.NUMERIC:
 case Types.DECIMAL:
-DecimalVector decimalVector = 
(DecimalVector)root.getVector(columnName);
-decimalVector.setSafe(rowCount, rs.getBigDecimal(i));
-decimalVector.setValueCount(rowCount + 1);
+updateVector((DecimalVector)root.getVector(columnName),
+rs.getBigDecimal(i), rowCount);
 break;
 case Types.REAL:
 case Types.FLOAT:
-Float4Vector float4Vector = 
(Float4Vector)root.getVector(columnName);
-float4Vector.setSafe(rowCount, rs.getFloat(i));
-float4Vector.setValueCount(rowCount + 1);
+updateVector((Float4Vector)root.getVector(columnName),
+rs.getFloat(i), rowCount);
 break;
 case Types.DOUBLE:
-Float8Vector float8Vector = 
(Float8Vector)root.getVector(columnName);
-float8Vector.setSafe(rowCount, rs.getDouble(i));
-float8Vector.setValueCount(rowCount + 1);
+updateVector((Float8Vector)root.getVector(columnName),
+rs.getDouble(i), rowCount);
 break;
 case Types.CHAR:
 case Types.NCHAR:
 case Types.VARCHAR:
 case Types.NVARCHAR:
 case Types.LONGVARCHAR:
 case Types.LONGNVARCHAR:
-VarCharVector varcharVector = 
(VarCharVector)root.getVector(columnName);
-String value = rs.getString(i) != null ? 
rs.getString(i) : "";
-varcharVector.setIndexDefined(rowCount);
-
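
The updateVector helper that the refactored lines call is not included in this excerpt; a minimal sketch of two of its overloads, inferred from the removed inline code above (the names and overload set are illustrative only, not the PR's actual implementation):

    private static void updateVector(BitVector bitVector, boolean value, int rowCount) {
        // Same behaviour as the removed inline code: set the value and bump the value count.
        bitVector.setSafe(rowCount, value ? 1 : 0);
        bitVector.setValueCount(rowCount + 1);
    }

    private static void updateVector(IntVector intVector, int value, int rowCount) {
        intVector.setSafe(rowCount, value);
        intVector.setValueCount(rowCount + 1);
    }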

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431103#comment-16431103
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180205035
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16431015#comment-16431015
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r180185358
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428685#comment-16428685
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-379330225
 
 
   @donderom I recently made that change based on some earlier comments from 
@laurentgo. I have added that as another interface, so we are good!


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-06 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16428410#comment-16428410
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

donderom commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to convert 
Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-379276243
 
 
   As I understand it, the idea is to convert `java.sql.ResultSet` to Arrow. The 
result set can be provided by a third-party library, which would make the 
`sqlToArrow(Connection connection, String query)` API unusable. What about 
something like `sqlToArrow(ResultSet resultSet)`?
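
A minimal sketch of what such an overload could look like, reusing the JdbcToArrowUtils helpers quoted elsewhere in this thread; the allocator handling and error message are illustrative only and not the PR's actual implementation:

    public static VectorSchemaRoot sqlToArrow(ResultSet resultSet) throws Exception {
        Preconditions.checkNotNull(resultSet, "JDBC ResultSet object cannot be null");
        RootAllocator allocator = new RootAllocator(Integer.MAX_VALUE);
        // Build the schema from the result set metadata, then populate the vectors row by row.
        VectorSchemaRoot root = VectorSchemaRoot.create(
                JdbcToArrowUtils.jdbcToArrowSchema(resultSet.getMetaData()), allocator);
        JdbcToArrowUtils.jdbcToArrowVectors(resultSet, root);
        return root;
    }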


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424924#comment-16424924
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179015462
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses the following data mapping to map JDBC/SQL data types to Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not close the passed connection object.
+ *   Since the caller has passed the connection object, it is the responsibility of the
+ *   caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagates any SQLExceptions to the caller after closing any
+ *   resources opened, such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC connection object cannot be null";
+assert query != null && query.length() > 0: "SQL query cannot be null or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
+
+Statement stmt = null;
 
 Review comment:
   Fixed.
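
For reference, a small usage sketch of the sqlToArrow(Connection, String) API quoted above; the JDBC URL, driver, and query are placeholders and not part of the PR:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import org.apache.arrow.adapter.jdbc.JdbcToArrow;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class JdbcToArrowExample {
        public static void main(String[] args) throws Exception {
            // Placeholder in-memory database URL; any JDBC source would work the same way.
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:test")) {
                VectorSchemaRoot root = JdbcToArrow.sqlToArrow(conn, "select * from test_table");
                System.out.println(root.getSchema());    // Arrow schema derived from the ResultSetMetaData
                System.out.println(root.getRowCount());  // number of rows converted
            }
        }
    }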


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424921#comment-16424921
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179015368
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table, where the user can specify the columns to fetch
+ * along with limit and offset parameters.
+ *
+ * An object of this class is returned by invoking the method jdbcArrowDataFetcher(Connection connection, String tableName)
+ * on the {@link JdbcToArrow} class. The caller can use this object to fetch data repeatedly based on the
+ * data fetch requirement and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
+this.connection = connection;
+this.tableName = tableName;
+}
+
+/**
+ * Fetch the data from underlying table with the given limit and offset 
and for passed column names.
+ *
+ * @param offset
+ * @param limit
+ * @param columns
+ * @return
+ * @throws Exception
+ */
+public VectorSchemaRoot fetch(int offset, int limit, String... columns) 
throws Exception {
+assert columns != null && columns.length > 0 : "columns can't be 
empty!";
 
 Review comment:
   Fixed.
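
A short pagination sketch based on the fetch(offset, limit, columns...) signature quoted above; the jdbcArrowDataFetcher factory is assumed from the class comment, and the table and column names are placeholders:

    // Sketch only: pages through a table using the fetcher, assuming an open java.sql.Connection.
    static void readInPages(Connection conn) throws Exception {
        ArrowDataFetcher fetcher = JdbcToArrow.jdbcArrowDataFetcher(conn, "test_table");
        int pageSize = 100;
        for (int offset = 0; ; offset += pageSize) {
            VectorSchemaRoot page = fetcher.fetch(offset, pageSize, "id", "name");
            if (page.getRowCount() == 0) {
                break;  // no more rows to fetch
            }
            // process the current page of Arrow vectors ...
        }
    }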


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424918#comment-16424918
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179015325
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses the following data mapping to map JDBC/SQL data types to Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not close the passed connection object.
+ *   Since the caller has passed the connection object, it is the responsibility of the
+ *   caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagates any SQLExceptions to the caller after closing any
+ *   resources opened, such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC connection object cannot be null";
 
 Review comment:
   Fixed. Also changed this to use Preconditions.
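
For clarity, a sketch of what the assert-to-Preconditions change described above would look like for these two checks; the exact message strings are illustrative:

    Preconditions.checkNotNull(connection, "JDBC connection object cannot be null");
    Preconditions.checkArgument(query != null && query.length() > 0,
            "SQL query cannot be null or empty");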


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424917#comment-16424917
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179015269
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/Table.java
 ##
 @@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+/**
+ *
 
 Review comment:
   Added relevant doc comment.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424903#comment-16424903
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-378461698
 
 
   Hi @laurentgo and @siddharthteotia, I am still working on some of the code 
review changes. I have checked in some code fixes. I will let you know once I 
am done with all the changes, or if I need to discuss anything with you. 
Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424872#comment-16424872
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009875
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424870#comment-16424870
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009787
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424868#comment-16424868
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009692
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424865#comment-16424865
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009588
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424862#comment-16424862
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009456
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424861#comment-16424861
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009426
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424853#comment-16424853
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r179009203
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/Table.java
 ##
 @@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+/**
+ *
 
 Review comment:
   Will add a necessary comment.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-03 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16424685#comment-16424685
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178977178
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423151#comment-16423151
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178652817
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423149#comment-16423149
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178652242
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Callers can use this object to fetch data 
repeatedly based on their
+ * data fetch requirements and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
+this.connection = connection;
+this.tableName = tableName;
+}
+
+/**
+ * Fetch the data from underlying table with the given limit and offset 
and for passed column names.
+ *
+ * @param offset
+ * @param limit
+ * @param columns
+ * @return
+ * @throws Exception
+ */
+public VectorSchemaRoot fetch(int offset, int limit, String... columns) 
throws Exception {
+assert columns != null && columns.length > 0 : "columns can't be 
empty!";
 
 Review comment:
   Yea, I hear you!

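For context, a hedged usage sketch of the pagination-style API quoted above. The jdbcArrowDataFetcher factory is taken from the Javadoc in the diff; the connection URL, table and column names are illustrative assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;

    import org.apache.arrow.adapter.jdbc.ArrowDataFetcher;
    import org.apache.arrow.adapter.jdbc.JdbcToArrow;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class PaginatedFetchExample {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:test")) {
                ArrowDataFetcher fetcher = JdbcToArrow.jdbcArrowDataFetcher(conn, "employees");
                int pageSize = 100;
                // Each call formats and runs a fresh "select ... limit ... offset ..." query,
                // so no cursor state is held between pages.
                VectorSchemaRoot page1 = fetcher.fetch(0, pageSize, "id", "name");
                VectorSchemaRoot page2 = fetcher.fetch(pageSize, pageSize, "id", "name");
                System.out.println(page1.getRowCount() + " + " + page2.getRowCount() + " rows fetched");
            }
        }
    }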

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423141#comment-16423141
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178651658
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423125#comment-16423125
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178648346
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Callers can use this object to fetch data 
repeatedly based on their
+ * data fetch requirements and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
 
 Review comment:
   That's my belief too...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423062#comment-16423062
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178636322
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Callers can use this object to fetch data 
repeatedly based on their
+ * data fetch requirements and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
+this.connection = connection;
+this.tableName = tableName;
+}
+
+/**
+ * Fetch the data from underlying table with the given limit and offset 
and for passed column names.
+ *
+ * @param offset
+ * @param limit
+ * @param columns
+ * @return
+ * @throws Exception
+ */
+public VectorSchemaRoot fetch(int offset, int limit, String... columns) 
throws Exception {
+assert columns != null && columns.length > 0 : "columns can't be 
empty!";
 
 Review comment:
   Apart from the semantic difference (assertions are usually an internal 
precondition of how the class behaves, to help debugging), asserts are only 
turned on if enabled at the JVM level (using the -ea flag), whereas 
preconditions are always checked.
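
A minimal illustration of the distinction made above (not code from the PR; the explicit if-check stands in for whatever precondition utility the project ends up using):

    public final class PreconditionExample {

        public static int firstLength(String... columns) {
            // Skipped entirely unless the JVM is started with -ea / -enableassertions.
            assert columns != null && columns.length > 0 : "columns can't be empty!";

            // Always-on precondition check; this is what e.g. Guava's
            // Preconditions.checkArgument does under the hood.
            if (columns == null || columns.length == 0) {
                throw new IllegalArgumentException("columns can't be empty!");
            }
            return columns[0].length();
        }

        public static void main(String[] args) {
            System.out.println(firstLength("id", "name")); // prints 2
            firstLength(); // throws IllegalArgumentException even without -ea
        }
    }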


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423063#comment-16423063
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178636342
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Callers can use this object to fetch data 
repeatedly based on their
+ * data fetch requirements and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
 
 Review comment:
   Aah, okay, that makes sense. I think in that case, we don't even need to 
worry about supporting various databases, or about testing against each of them.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423060#comment-16423060
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178635982
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses following data mapping to map JDBC/SQL datatype to Arrow 
data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL Exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC connection object can not be null";
+assert query != null && query.length() > 0: "SQL query can not be null 
or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
 
 Review comment:
   If there's no way to automatically free buffers/close the allocator, you 
probably want to modify the existing function to take one as an input.
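
A hedged sketch of the caller-owned allocator pattern being suggested here; the three-argument sqlToArrow overload shown is hypothetical at this point in the review (only the (Connection, String) variant exists in the diff):

    import java.sql.Connection;
    import java.sql.DriverManager;

    import org.apache.arrow.adapter.jdbc.JdbcToArrow;
    import org.apache.arrow.memory.RootAllocator;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class CallerOwnedAllocatorExample {
        public static void main(String[] args) throws Exception {
            // The caller creates, owns and closes the allocator, so every Arrow buffer
            // produced by the conversion is released deterministically on close().
            try (RootAllocator allocator = new RootAllocator(Integer.MAX_VALUE);
                 Connection conn = DriverManager.getConnection("jdbc:h2:mem:test")) {
                // Hypothetical overload that takes the caller-supplied allocator.
                try (VectorSchemaRoot root =
                         JdbcToArrow.sqlToArrow(conn, "SELECT * FROM my_table", allocator)) {
                    System.out.println(root.getRowCount() + " rows converted");
                }
            }
        }
    }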


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423057#comment-16423057
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178635650
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Callers can use this object to fetch data 
repeatedly based on their
+ * data fetch requirements and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
 
 Review comment:
   My suggestion is to let the user do the query part (get a connection, 
create the statement and execute it), and use the resulting {{ResultSet}} to do 
the Arrow conversion (and hopefully there is no need to deal with different 
dialects and other stuff)
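
A hedged sketch of that ResultSet-driven flow; a sqlToArrow(ResultSet) style entry point is only a suggestion at this stage and is not part of the current diff, and the connection URL and query are illustrative assumptions:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    import org.apache.arrow.adapter.jdbc.JdbcToArrow;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class ResultSetDrivenExample {
        public static void main(String[] args) throws Exception {
            // The caller owns the JDBC side: connection, statement and dialect-specific SQL.
            try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:test");
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT id, name FROM employees LIMIT 100")) {
                // The adapter only ever sees the ResultSet, so it never builds SQL itself
                // and never has to care about vendor-specific limit/offset syntax.
                VectorSchemaRoot root = JdbcToArrow.sqlToArrow(rs);  // hypothetical entry point
                System.out.println(root.getRowCount() + " rows converted");
            }
        }
    }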


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423035#comment-16423035
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178629345
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses following data mapping to map JDBC/SQL datatype to Arrow 
data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from the relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object, it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC connection object can not be null";
+assert query != null && query.length() > 0: "SQL query can not be null 
or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
 
 Review comment:
   Do you think it would be good to provide another overloaded API with 
RootAllocator as an argument - 
   public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query, RootAllocator)
   or should I just modify the existing one?
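   A minimal sketch of how such an overload could delegate to the existing entry 
point (the parameter name `allocator` and the use of BaseAllocator are 
assumptions, not the PR's final signature; the conversion body is elided):

   import java.sql.Connection;

   import org.apache.arrow.memory.BaseAllocator;
   import org.apache.arrow.memory.RootAllocator;
   import org.apache.arrow.vector.VectorSchemaRoot;

   public final class JdbcToArrowOverloadSketch {

       // Existing-style entry point: creates and owns its own allocator.
       public static VectorSchemaRoot sqlToArrow(Connection connection, String query) throws Exception {
           return sqlToArrow(connection, query, new RootAllocator(Integer.MAX_VALUE));
       }

       // Proposed overload: the caller supplies (and is responsible for closing) the allocator.
       public static VectorSchemaRoot sqlToArrow(Connection connection, String query,
               BaseAllocator allocator) throws Exception {
           // The actual ResultSet-to-vector conversion is elided; only the shape
           // of the overload is under discussion here.
           throw new UnsupportedOperationException("conversion elided in this sketch");
       }
   }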



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423029#comment-16423029
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178628383
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * An object of this class is returned by invoking the method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from the {@link JdbcToArrow} class. The caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
+this.connection = connection;
+this.tableName = tableName;
+}
+
+/**
+ * Fetch the data from underlying table with the given limit and offset 
and for passed column names.
+ *
+ * @param offset
+ * @param limit
+ * @param columns
+ * @return
+ * @throws Exception
+ */
+public VectorSchemaRoot fetch(int offset, int limit, String... columns) 
throws Exception {
+assert columns != null && columns.length > 0 : "columns can't be 
empty!";
 
 Review comment:
   Do you want me to use any particular Preconditions API, such as Guava's 
com.google.common.base.Preconditions? "assert" does pretty much the same 
thing, except it throws AssertionError as opposed to IllegalArgumentException.
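   For reference, a minimal sketch of what the Guava-based check could look like 
(assuming Guava is on the classpath; the class and method names below are made 
up for illustration):

   import com.google.common.base.Preconditions;

   public class PreconditionsSketch {

       // Unlike assert (disabled unless the JVM runs with -ea), checkArgument always
       // runs and throws IllegalArgumentException with the given message on failure.
       static void validateColumns(String... columns) {
           Preconditions.checkArgument(columns != null && columns.length > 0,
                   "columns can't be empty!");
       }

       public static void main(String[] args) {
           validateColumns("one", "two");   // passes
           validateColumns();               // throws IllegalArgumentException
       }
   }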



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423021#comment-16423021
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178626181
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * An object of this class is returned by invoking the method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from the {@link JdbcToArrow} class. The caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
 
 Review comment:
   Yes, this is true, my bad! I was aware of this and should have thought it 
through before implementing this. What I am thinking now is to come up with a 
Java enum for all the databases and maintain a map or constant string with the 
(limit/offset) query specific to each database. This way I can support 
ORACLE_12C, MYSQL, DB2, SQL_SERVER_2012, SQL_SERVER_2008, POSTGRESQL, H2, 
SQLDB, INGRES, DERBY, SQLITE, CUBRID, SYBASE_ASE, SYBASE_SQL_ANYWHERE and 
FIREBIRD. But the only problem here is writing test cases. What do you think 
about this approach, and if it is okay, how can we go about testing the code 
for each database?
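   A rough sketch of that enum idea (the enum name, the constants shown and the 
query templates are illustrative assumptions, not code from the PR; dialects 
with a different pagination syntax get their own template):

   public enum JdbcDialect {

       H2("select %1$s from %2$s limit %3$d offset %4$d"),
       MYSQL("select %1$s from %2$s limit %3$d offset %4$d"),
       POSTGRESQL("select %1$s from %2$s limit %3$d offset %4$d"),
       // SQL Server 2012+ uses OFFSET/FETCH instead of LIMIT/OFFSET.
       SQL_SERVER_2012("select %1$s from %2$s order by (select null) offset %4$d rows fetch next %3$d rows only");

       private final String pagedQueryTemplate;

       JdbcDialect(String pagedQueryTemplate) {
           this.pagedQueryTemplate = pagedQueryTemplate;
       }

       // Positional format arguments keep the call site identical across dialects.
       public String pagedQuery(String columns, String table, int limit, int offset) {
           return String.format(pagedQueryTemplate, columns, table, limit, offset);
       }
   }

   For example, JdbcDialect.H2.pagedQuery("*", "table1", 10, 0) would produce 
"select * from table1 limit 10 offset 0".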




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423019#comment-16423019
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178626181
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * An object of this class is returned by invoking the method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from the {@link JdbcToArrow} class. The caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
 
 Review comment:
   Yes, this is true. I was aware of this and should have thought it through 
before implementing this. What I am thinking now is to come up with a Java 
enum for all the databases and maintain a map or constant string with the 
(limit/offset) query specific to each database. This way I can support 
ORACLE_12C, MYSQL, DB2, SQL_SERVER_2012, SQL_SERVER_2008, POSTGRESQL, H2, 
SQLDB, INGRES, DERBY, SQLITE, CUBRID, SYBASE_ASE, SYBASE_SQL_ANYWHERE and 
FIREBIRD. But the only problem here is writing test cases. What do you think 
about this approach, and if it is okay, how can we go about testing the code 
for each database?




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16423015#comment-16423015
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r178625435
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify 
columns to fetch
+ * along with limit and offset parameters.
+ *
+ * An object of this class is returned by invoking the method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from the {@link JdbcToArrow} class. The caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
 
 Review comment:
   @laurentgo Can you elaborate a bit - what do you mean by wrapping a 
ResultSet?
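   One possible reading of "wrapping a ResultSet" (an assumption, not necessarily 
what the reviewer meant) is an entry point that accepts an already-executed 
ResultSet, so the adapter never has to build dialect-specific SQL at all:

   import java.sql.ResultSet;

   import org.apache.arrow.vector.VectorSchemaRoot;

   public interface ResultSetToArrow {
       // The caller runs whatever query it wants (any dialect, any pagination scheme)
       // and hands over the ResultSet; the adapter only does row-to-vector conversion.
       VectorSchemaRoot convert(ResultSet resultSet) throws Exception;
   }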




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418166#comment-16418166
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-377045969
 
 
   Hi @laurentgo, now I do have a handful of review comments to work on. As I 
work on each one of those, some may need a short discussion with you. I hope 
that's okay. Thanks.




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416450#comment-16416450
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-376706778
 
 
   Thanks @laurentgo and @siddharthteotia for all the review comments so far. 
Let me work on these and follow up with my comments as I start working on the 
changes.




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416399#comment-16416399
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177591754
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416400#comment-16416400
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177592922
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a Helper class which has functionalities to read and assert the 
values from teh given FieldVector object
 
 Review comment:
   typo: teh -> the




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416404#comment-16416404
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177590101
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416397#comment-16416397
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177591421
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416403#comment-16416403
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177593622
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a Helper class which has functionalities to read and assert the 
values from teh given FieldVector object
+ *
+ */
+public class JdbcToArrowTestHelper {
+
+public static boolean assertIntVectorValues(FieldVector fx, int rowCount, 
int[] values) {
+IntVector intVector = ((IntVector) fx);
+
+assertEquals(rowCount, intVector.getValueCount());
+
+for(int j = 0; j < intVector.getValueCount(); j++) {
+if(!intVector.isNull(j)) {
+assertEquals(values[j], intVector.get(j));
+}
+}
+return true;
 
 Review comment:
   not sure what the boolean return value is for, since it's always `true`
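   A minimal sketch of the void-returning variant that comment points to (logic 
copied from the quoted helper, only the return type changes):

   import static org.junit.Assert.assertEquals;

   import org.apache.arrow.vector.FieldVector;
   import org.apache.arrow.vector.IntVector;

   public class JdbcToArrowAssertSketch {

       // JUnit assertions already throw on failure, so there is nothing useful
       // to return; a void signature makes that explicit.
       public static void assertIntVectorValues(FieldVector fx, int rowCount, int[] values) {
           IntVector intVector = (IntVector) fx;
           assertEquals(rowCount, intVector.getValueCount());
           for (int j = 0; j < intVector.getValueCount(); j++) {
               if (!intVector.isNull(j)) {
                   assertEquals(values[j], intVector.get(j));
               }
           }
       }
   }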




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416407#comment-16416407
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177597188
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/ArrowDataFetcherTest.java
 ##
 @@ -0,0 +1,139 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.*;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+
+import static org.junit.Assert.*;
+
+/**
+ * Test class for {@link ArrowDataFetcher}.
+ */
+public class ArrowDataFetcherTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:ArrowDataFetcherTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
+
+conn = DriverManager.getConnection(url);
+}
+
+@After
+public void destroy() throws Exception {
+if (conn != null) {
+conn.close();
+conn = null;
+}
+}
+
+@Test
+public void commaSeparatedQueryColumnsTest() {
+try {
+ArrowDataFetcher.commaSeparatedQueryColumns(null);
+} catch (AssertionError error) {
+assertTrue(true);
+}
+assertEquals(" one ", 
ArrowDataFetcher.commaSeparatedQueryColumns("one"));
+assertEquals(" one, two ", 
ArrowDataFetcher.commaSeparatedQueryColumns("one", "two"));
+assertEquals(" one, two, three ", 
ArrowDataFetcher.commaSeparatedQueryColumns("one", "two", "three"));
+}
+
+@Test
+public void arrowFetcherAllColumnsLimitOffsetTest() throws Exception {
+
+Table table =
+mapper.readValue(
+
this.getClass().getClassLoader().getResourceAsStream("h2/test1_int_h2.yml"),
+Table.class);
+
+try {
+createTestData(conn, table);
+
+ArrowDataFetcher arrowDataFetcher = 
JdbcToArrow.jdbcArrowDataFetcher(conn, "table1");
+
+VectorSchemaRoot root = arrowDataFetcher.fetch(0, 10);
+
+int[] values = {
+101, 101, 101, 101, 101, 101, 101, 101, 101, 101
+};
+
JdbcToArrowTestHelper.assertIntVectorValues(root.getVector("INT_FIELD1"), 10, 
values);
+
+root = arrowDataFetcher.fetch(5, 5);
+
+
JdbcToArrowTestHelper.assertIntVectorValues(root.getVector("INT_FIELD1"), 5, 
values);
+
+} catch (Exception e) {
+e.printStackTrace();
+} finally {
+deleteTestData(conn, table);
 
 Review comment:
   since the connection is closed, that should trigger the in-memory db to be 
cleaned up (so a drop table shouldn't be required...)
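   For context, an H2 in-memory database named in the URL lives only as long as 
it has at least one open connection, so closing the test connection is enough 
to discard the data; keeping it alive would require opting in explicitly (a 
small illustration, not code from the PR):

   import java.sql.Connection;
   import java.sql.DriverManager;

   public class H2InMemoryLifetimeSketch {
       public static void main(String[] args) throws Exception {
           // Default behaviour: the database vanishes when its last connection closes,
           // which is why no explicit drop/delete is needed in the test teardown.
           try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:ArrowDataFetcherTest")) {
               conn.createStatement().execute("create table table1(int_field1 int)");
           } // table1 (and the whole database) is gone here

           // Opt-in alternative: DB_CLOSE_DELAY=-1 keeps the in-memory database
           // alive until the JVM exits, even with no open connections.
           DriverManager.getConnection("jdbc:h2:mem:ArrowDataFetcherTest;DB_CLOSE_DELAY=-1").close();
       }
   }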



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416417#comment-16416417
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177592821
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416416#comment-16416416
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177596733
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/ArrowDataFetcherTest.java
 ##
 @@ -0,0 +1,139 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.*;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+
+import static org.junit.Assert.*;
+
+/**
+ * Test class for {@link ArrowDataFetcher}.
+ */
+public class ArrowDataFetcherTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:ArrowDataFetcherTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
+
+conn = DriverManager.getConnection(url);
+}
+
+@After
+public void destroy() throws Exception {
+if (conn != null) {
+conn.close();
+conn = null;
+}
+}
+
+@Test
+public void commaSeparatedQueryColumnsTest() {
+try {
+ArrowDataFetcher.commaSeparatedQueryColumns(null);
+} catch (AssertionError error) {
+assertTrue(true);
 
 Review comment:
   why this check? (it seems to simply swallow the AssertionError thrown by the 
statement, but it doesn't verify that the call always fails either...)
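One stricter form, sketched with the JUnit 4 API this test class already uses; it assumes (based on the snippet above) that passing null is meant to trigger an AssertionError inside commaSeparatedQueryColumns:

@Test(expected = AssertionError.class)
public void commaSeparatedQueryColumnsNullTest() {
    // The test now fails unless the call actually throws AssertionError,
    // instead of passing silently whether or not the assertion fires.
    ArrowDataFetcher.commaSeparatedQueryColumns(null);
}

Note that this relies on JVM assertions being enabled (-ea) during the test run, just as the original try/catch did.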


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and convert the data into Arrow 
> objects/structures. So from that perspective this will read data from RDBMS. 
> Whether the utility can push Arrow objects back to RDBMS is something that 
> needs to be discussed and is out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416405#comment-16416405
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177591258
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416421#comment-16416421
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177591823
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416396#comment-16416396
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177589268
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
 
 Review comment:
   Java good practice is to not use wildcard imports (at least not for static 
imports). The IDE should be able to expand them automatically.
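A small illustration of the expanded form; the two class names below are only examples taken from elsewhere in the patch, since the exact list depends on which vector classes the final file uses:

// instead of: import org.apache.arrow.vector.*;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.FieldVector;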


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and convert the data into Arrow 
> objects/structures. So from that perspective this will read data from RDBMS. 
> Whether the utility can push Arrow objects back to RDBMS is something that 
> needs to be discussed and is out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416408#comment-16416408
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177592278
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416393#comment-16416393
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177589848
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416415#comment-16416415
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177595960
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/JdbcToArrowTest.java
 ##
 @@ -0,0 +1,325 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.AbstractJdbcToArrowTest;
+import org.apache.arrow.adapter.jdbc.JdbcToArrow;
+import org.apache.arrow.adapter.jdbc.Table;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.math.BigDecimal;
+import java.sql.Connection;
+import java.sql.DriverManager;
+import java.util.Properties;
+
+import static org.apache.arrow.adapter.jdbc.JdbcToArrowTestHelper.*;
+
+/**
+ * Test class for {@link JdbcToArrow}.
+ */
+public class JdbcToArrowTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:JdbcToArrowTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
+
+conn = DriverManager.getConnection(url);
+}
+
+@After
+public void destroy() throws Exception {
+if (conn != null) {
+conn.close();
+conn = null;
+}
+}
+
+@Test
+public void sqlToArrowTestInt() throws Exception {
 
 Review comment:
   suggestion for all the tests comparing results (a rough sketch follows below)...
   - put the type/data in the yaml file (instead of SQL statements)
   - generate the h2 schema + load data automatically into the db
   - refactor the test class to use `@Parameterized` with the list of YAML 
files to load
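A rough JUnit 4 sketch of that structure; the class name, the YAML file names and the empty test body are placeholders for illustration only:

import java.util.Arrays;
import java.util.Collection;

import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.Parameterized;
import org.junit.runners.Parameterized.Parameters;

@RunWith(Parameterized.class)
public class JdbcToArrowParameterizedSketch {

    private final String yamlFile;

    public JdbcToArrowParameterizedSketch(String yamlFile) {
        this.yamlFile = yamlFile;
    }

    // One test instance per YAML file; each file would carry the column type,
    // the rows to load into H2 and the expected vector values.
    @Parameters(name = "{0}")
    public static Collection<Object[]> yamlFiles() {
        return Arrays.asList(new Object[][] {
                {"h2/test_int_datatype.yml"},
                {"h2/test_varchar_datatype.yml"}
        });
    }

    @Test
    public void sqlToArrowRoundTrip() throws Exception {
        // Placeholder: load yamlFile, create the table and insert the rows into H2,
        // run JdbcToArrow.sqlToArrow, then compare the vectors with the expected values.
    }
}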


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and convert the data into Arrow 
> objects/structures. So from that perspective this will read data from RDBMS. 
> Whether the utility can push Arrow objects back to RDBMS is something that 
> needs to be discussed and is out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416388#comment-16416388
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177588025
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify 
the columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking the method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from the {@link JdbcToArrow} class. The caller can use this object to fetch data 
repeatedly based on the
+ * data fetch requirements and can implement pagination-like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply 
executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
 
 Review comment:
   that assumes this syntax is supported by all SQL dialects (which is not the 
case).
   Also, the convention for constants is to use `SNAKE_CASE`
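As a naming-only illustration, a drop-in rename of the quoted field inside ArrowDataFetcher (it does not address the dialect concern, which would need a per-database pagination strategy):

// constant in UPPER_SNAKE_CASE; note that LIMIT/OFFSET works for H2 but is
// not portable across every SQL dialect, as pointed out above
private static final String ALL_COLUMNS_QUERY = "select * from %s limit %d offset %d";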


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and convert the data into Arrow 
> objects/structures. So from that perspective this will read data from RDBMS. 
> Whether the utility can push Arrow objects back to RDBMS is something that 
> needs to be discussed and is out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416414#comment-16416414
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177593055
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416391#comment-16416391
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177589483
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data 
types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416409#comment-16416409
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177588989
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses the following mapping from JDBC/SQL data types to Arrow 
data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from the relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object; since the caller has passed
+ *   the connection object, it is the caller's responsibility to 
close or return the connection to the pool.
+ * @param query The SQL query used to fetch the data.
+ * @return VectorSchemaRoot containing the query results as Arrow vectors.
+ * @throws SQLException Propagates any SQL exceptions to the caller after 
closing any resources opened, such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC connection object can not be null";
+assert query != null && query.length() > 0: "SQL query can not be null 
or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
+
+Statement stmt = null;
+ResultSet rs = null;
+try {
+stmt = connection.createStatement();
+rs = stmt.executeQuery(query);
+ResultSetMetaData rsmd = rs.getMetaData();
+VectorSchemaRoot root = VectorSchemaRoot.create(
+JdbcToArrowUtils.jdbcToArrowSchema(rsmd), rootAllocator);
+JdbcToArrowUtils.jdbcToArrowVectors(rs, root);
+return root;
+} catch (Exception exc) {
+// rethrow to the caller; resources are released in the finally block
+throw exc;
+} finally {
+if (rs != null) {
+rs.close();
+}
+if (stmt != null) {
+stmt.close();
+}
+}
+}
+
+/**
+ * This method returns ArrowDataFetcher Object that can be used to fetch 
and iterate on the data in the given
+ * database table.
+ *
+ * @param connection - Database connection Object
+ * @param tableName - Table name from which records will be fetched
+ *
+ * @return ArrowDataFetcher - 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416402#comment-16416402
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177595348
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/ArrowDataFetcherTest.java
 ##
 @@ -0,0 +1,139 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.*;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+
+import static org.junit.Assert.*;
+
+/**
+ * Test class for {@link ArrowDataFetcher}.
+ */
+public class ArrowDataFetcherTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:ArrowDataFetcherTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
+
+conn = DriverManager.getConnection(url);
+}
+
+@After
+public void destroy() throws Exception {
+if (conn != null) {
+conn.close();
+conn = null;
+}
+}
+
+@Test
+public void commaSeparatedQueryColumnsTest() {
 
 Review comment:
   suggestion for all the tests comparing results...
   - put the data in the yaml file (instead of SQL statements)
   - generate the h2 schema + load data automatically into the db
   - refactor the test class to use `@Parameterized` with the list of YAML 
files to load


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and convert the data into Arrow 
> objects/structures. So from that perspective this will read data from RDBMS. 
> Whether the utility can push Arrow objects back to RDBMS is something that 
> needs to be discussed and is out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416411#comment-16416411
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177594750
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/ArrowDataFetcherTest.java
 ##
 @@ -0,0 +1,139 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.*;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+
+import static org.junit.Assert.*;
+
+/**
+ * Test class for {@link ArrowDataFetcher}.
+ */
+public class ArrowDataFetcherTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:ArrowDataFetcherTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
 
 Review comment:
   not sure that's necessary (H2 ships the JDBC 4 service metadata file, so DriverManager can load the driver automatically)
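A trimmed sketch of the same fixture under that assumption (H2's jar contains META-INF/services/java.sql.Driver, so JDBC 4 driver auto-loading applies):

@Before
public void setUp() throws Exception {
    mapper = new ObjectMapper(new YAMLFactory());
    // DriverManager discovers org.h2.Driver through the service loader,
    // so the explicit Class.forName call is no longer needed.
    conn = DriverManager.getConnection("jdbc:h2:mem:ArrowDataFetcherTest");
}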


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very similar to the C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and convert the data into Arrow 
> objects/structures. So from that perspective this will read data from RDBMS. 
> Whether the utility can push Arrow objects back to RDBMS is something that 
> needs to be discussed and is out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416394#comment-16416394
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177588568
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses the following mapping from JDBC/SQL data types to Arrow 
data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL Exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC conncetion object can not be null";
+assert query != null && query.length() > 0: "SQL query can not be null 
or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
 
 Review comment:
   Shouldn't the allocator be provided, so that the caller has control over it? As of now, it cannot be closed once you're done with the buffers...
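
   A rough sketch of what a caller-supplied allocator could look like; the extra parameter and the reuse of the JdbcToArrowUtils helpers shown in the excerpt are assumptions for illustration, not the final API:

    package org.apache.arrow.adapter.jdbc;

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    import org.apache.arrow.memory.BaseAllocator;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class JdbcToArrowAllocatorSketch {

        // Hypothetical overload: the caller supplies and owns the allocator,
        // so it can be closed once the returned vectors are no longer needed.
        public static VectorSchemaRoot sqlToArrow(Connection connection, String query,
                                                  BaseAllocator allocator) throws Exception {
            Statement stmt = null;
            ResultSet rs = null;
            try {
                stmt = connection.createStatement();
                rs = stmt.executeQuery(query);
                VectorSchemaRoot root = VectorSchemaRoot.create(
                        JdbcToArrowUtils.jdbcToArrowSchema(rs.getMetaData()), allocator);
                JdbcToArrowUtils.jdbcToArrowVectors(rs, root);
                return root;
            } finally {
                if (rs != null) {
                    rs.close();
                }
                if (stmt != null) {
                    stmt.close();
                }
            }
        }
    }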



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416389#comment-16416389
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177587817
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
 
 Review comment:
   Why not wrap a ResultSet instead?
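
   For reference, a minimal sketch of a ResultSet-based entry point: the caller runs whatever query and paging it wants, and the adapter only converts rows to vectors. The method name and allocator parameter here are assumptions, and the JdbcToArrowUtils helpers are the ones shown elsewhere in the PR:

    package org.apache.arrow.adapter.jdbc;

    import java.sql.ResultSet;

    import org.apache.arrow.memory.BaseAllocator;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class ResultSetToArrowSketch {

        // Hypothetical API: wrap an existing ResultSet instead of building
        // "select ... limit ... offset ..." strings internally.
        public static VectorSchemaRoot resultSetToArrow(ResultSet rs, BaseAllocator allocator)
                throws Exception {
            VectorSchemaRoot root = VectorSchemaRoot.create(
                    JdbcToArrowUtils.jdbcToArrowSchema(rs.getMetaData()), allocator);
            JdbcToArrowUtils.jdbcToArrowVectors(rs, root);
            return root;
        }
    }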




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416387#comment-16416387
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177588356
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
+this.connection = connection;
+this.tableName = tableName;
+}
+
+/**
+ * Fetch the data from underlying table with the given limit and offset 
and for passed column names.
+ *
+ * @param offset
+ * @param limit
+ * @param columns
+ * @return
+ * @throws Exception
+ */
+public VectorSchemaRoot fetch(int offset, int limit, String... columns) 
throws Exception {
+assert columns != null && columns.length > 0 : "columns can't be 
empty!";
 
 Review comment:
   Shouldn't `Preconditions` be used instead? It's not an internal assertion but an API contract, isn't it?
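
   A small sketch of that check rewritten with Guava's Preconditions (assuming Guava is available on the classpath); unlike `assert`, it is enforced regardless of whether the JVM runs with -ea. The standalone helper below is hypothetical, just to show the shape of the checks:

    import com.google.common.base.Preconditions;

    public class PreconditionsSketch {

        // Sketch of the argument validation only; the rest of fetch() would stay as in the PR.
        public static void validateFetchArguments(int offset, int limit, String... columns) {
            Preconditions.checkArgument(columns != null && columns.length > 0,
                    "columns can't be empty!");
            Preconditions.checkArgument(offset >= 0, "offset must be non-negative");
            Preconditions.checkArgument(limit > 0, "limit must be positive");
        }
    }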




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416386#comment-16416386
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177587483
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/ArrowDataFetcher.java
 ##
 @@ -0,0 +1,107 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.Connection;
+
+/**
+ * Class to fetch data from a given database table where the user can specify columns to fetch
+ * along with limit and offset parameters.
+ *
+ * The object of this class is returned by invoking method 
jdbcArrowDataFetcher(Connection connection, String tableName)
+ * from {@link JdbcToArrow} class. Caller can use this object to fetch data 
repetitively based on the
+ * data fetch requirement and can implement pagination like functionality.
+ *
+ * This class doesn't hold any open connections to the database but simply executes the "select" query every time with
+ * the necessary limit and offset parameters.
+ *
+ * @since 0.10.0
+ * @see JdbcToArrow
+ */
+public class ArrowDataFetcher {
+
+private static final String all_columns_query = "select * from %s limit %d 
offset %d";
+private static final String custom_columns_query = "select %s from %s 
limit %d offset %d";
+private Connection connection;
+private String tableName;
+
+/**
+ * Constructor
+ * @param connection
+ * @param tableName
+ */
+public ArrowDataFetcher(Connection connection, String tableName) {
+this.connection = connection;
+this.tableName = tableName;
+}
+
+/**
+ * Fetch the data from underlying table with the given limit and offset 
and for passed column names.
+ *
+ * @param offset
+ * @param limit
+ * @param columns
+ * @return
+ * @throws Exception
+ */
+public VectorSchemaRoot fetch(int offset, int limit, String... columns) 
throws Exception {
 
 Review comment:
   Exception is very broad, should we try to be more specific?
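
   A sketch of the narrower signature, under the assumption that the conversion layer only needs to surface SQL failures:

    import java.sql.SQLException;

    import org.apache.arrow.vector.VectorSchemaRoot;

    public interface ArrowDataFetcherSignatureSketch {

        // Hypothetical signature: callers handle SQLException rather than a blanket Exception.
        VectorSchemaRoot fetch(int offset, int limit, String... columns) throws SQLException;
    }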




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416410#comment-16416410
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177594176
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a helper class with functionality to read and assert the values from the given FieldVector object
+ *
+ */
+public class JdbcToArrowTestHelper {
+
+public static boolean assertIntVectorValues(FieldVector fx, int rowCount, 
int[] values) {
+IntVector intVector = ((IntVector) fx);
 
 Review comment:
   (style) Superfluous parentheses. (Also, you might want the cast to be handled by the caller, not the callee?)
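
   A sketch of the helper with the cast moved to the call site; the reshaped signature is an assumption, the names are kept from the excerpt:

    import static org.junit.Assert.assertEquals;

    import org.apache.arrow.vector.IntVector;

    public class CallerCastSketch {

        // The helper takes the concrete vector type; the test would call it as
        // assertIntVectorValues((IntVector) root.getVector("INT_FIELD1"), 10, values);
        public static void assertIntVectorValues(IntVector intVector, int rowCount, int[] values) {
            assertEquals(rowCount, intVector.getValueCount());
            for (int j = 0; j < rowCount; j++) {
                if (!intVector.isNull(j)) {
                    assertEquals(values[j], intVector.get(j));
                }
            }
        }
    }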




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416419#comment-16416419
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177594008
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a helper class with functionality to read and assert the values from the given FieldVector object
+ *
+ */
+public class JdbcToArrowTestHelper {
+
+public static boolean assertIntVectorValues(FieldVector fx, int rowCount, 
int[] values) {
+IntVector intVector = ((IntVector) fx);
+
+assertEquals(rowCount, intVector.getValueCount());
+
+for(int j = 0; j < intVector.getValueCount(); j++) {
 
 Review comment:
   What if `values.length` doesn't match `rowCount`?
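
   One way to make a length mismatch fail loudly is to assert on it up front; a small sketch (the standalone class is just for illustration):

    import static org.junit.Assert.assertEquals;

    import org.apache.arrow.vector.IntVector;

    public class LengthCheckSketch {

        public static void assertIntVectorValues(IntVector intVector, int rowCount, int[] values) {
            // Fail immediately if the expected values and the row count disagree,
            // instead of silently comparing only a prefix or hitting an array bounds error.
            assertEquals(rowCount, values.length);
            assertEquals(rowCount, intVector.getValueCount());
            for (int j = 0; j < rowCount; j++) {
                if (!intVector.isNull(j)) {
                    assertEquals(values[j], intVector.get(j));
                }
            }
        }
    }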




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416420#comment-16416420
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177595647
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/ArrowDataFetcherTest.java
 ##
 @@ -0,0 +1,139 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.*;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+
+import static org.junit.Assert.*;
+
+/**
+ * Test class for {@link ArrowDataFetcher}.
+ */
+public class ArrowDataFetcherTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:ArrowDataFetcherTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
+
+conn = DriverManager.getConnection(url);
+}
+
+@After
+public void destroy() throws Exception {
+if (conn != null) {
+conn.close();
+conn = null;
+}
+}
+
+@Test
+public void commaSeparatedQueryColumnsTest() {
+try {
+ArrowDataFetcher.commaSeparatedQueryColumns(null);
+} catch (AssertionError error) {
+assertTrue(true);
+}
+assertEquals(" one ", 
ArrowDataFetcher.commaSeparatedQueryColumns("one"));
+assertEquals(" one, two ", 
ArrowDataFetcher.commaSeparatedQueryColumns("one", "two"));
+assertEquals(" one, two, three ", 
ArrowDataFetcher.commaSeparatedQueryColumns("one", "two", "three"));
+}
+
+@Test
+public void arrowFetcherAllColumnsLimitOffsetTest() throws Exception {
+
+Table table =
+mapper.readValue(
+
this.getClass().getClassLoader().getResourceAsStream("h2/test1_int_h2.yml"),
+Table.class);
+
+try {
+createTestData(conn, table);
+
+ArrowDataFetcher arrowDataFetcher = 
JdbcToArrow.jdbcArrowDataFetcher(conn, "table1");
+
+VectorSchemaRoot root = arrowDataFetcher.fetch(0, 10);
+
+int[] values = {
+101, 101, 101, 101, 101, 101, 101, 101, 101, 101
+};
+
JdbcToArrowTestHelper.assertIntVectorValues(root.getVector("INT_FIELD1"), 10, 
values);
+
+root = arrowDataFetcher.fetch(5, 5);
+
+
JdbcToArrowTestHelper.assertIntVectorValues(root.getVector("INT_FIELD1"), 5, 
values);
+
+} catch (Exception e) {
+e.printStackTrace();
 
 Review comment:
   This test will not error out in case of an exception...
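
   A sketch of two common ways to make the failure visible to JUnit (test bodies elided; names taken from the excerpt):

    import static org.junit.Assert.fail;

    import org.junit.Test;

    public class FailingTestSketch {

        // Option 1: drop the try/catch entirely and declare `throws Exception`,
        // so any exception surfaces as a test error.
        @Test
        public void arrowFetcherAllColumnsLimitOffsetTest() throws Exception {
            // ... create test data, fetch, and run the assertions as in the PR ...
        }

        // Option 2: if the try/catch is kept (e.g. for cleanup), fail explicitly in the catch block.
        @Test
        public void arrowFetcherAllColumnsLimitOffsetWithCatchTest() {
            try {
                // ... create test data, fetch, and run the assertions as in the PR ...
            } catch (Exception e) {
                fail("fetch failed: " + e.getMessage());
            }
        }
    }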



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416395#comment-16416395
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177589044
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses the following data mapping to map JDBC/SQL datatypes to Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL Exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC conncetion object can not be null";
+assert query != null && query.length() > 0: "SQL query can not be null 
or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
+
+Statement stmt = null;
+ResultSet rs = null;
+try {
+stmt = connection.createStatement();
+rs = stmt.executeQuery(query);
+ResultSetMetaData rsmd = rs.getMetaData();
+VectorSchemaRoot root = VectorSchemaRoot.create(
+JdbcToArrowUtils.jdbcToArrowSchema(rsmd), rootAllocator);
+JdbcToArrowUtils.jdbcToArrowVectors(rs, root);
+return root;
+} catch (Exception exc) {
+// just throw it out after logging
+throw exc;
+} finally {
+if (rs != null) {
+rs.close();
+}
+if (stmt != null) {
+stmt.close(); // test
+}
+}
+}
+
+/**
+ * This method returns ArrowDataFetcher Object that can be used to fetch 
and iterate on the data in the given
 
 Review comment:
   Object -> object (same in the argument description)



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416413#comment-16416413
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177592633
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder fields = ImmutableList.builder();
+List fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416412#comment-16416412
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177593417
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a helper class with functionality to read and assert the values from the given FieldVector object
+ *
+ */
+public class JdbcToArrowTestHelper {
+
+public static boolean assertIntVectorValues(FieldVector fx, int rowCount, 
int[] values) {
+IntVector intVector = ((IntVector) fx);
+
+assertEquals(rowCount, intVector.getValueCount());
+
+for(int j = 0; j < intVector.getValueCount(); j++) {
+if(!intVector.isNull(j)) {
+assertEquals(values[j], intVector.get(j));
+}
+}
+return true;
+}
+
+public static boolean assertBitBooleanVectorValues(FieldVector fx, int 
rowCount, int[] values){
+BitVector bitVector = ((BitVector)fx);
+assertEquals(rowCount, bitVector.getValueCount());
+for(int j = 0; j < bitVector.getValueCount(); j++){
+if(!bitVector.isNull(j)) {
+assertEquals(values[j], bitVector.get(j));
+}
+}
+return true;
+}
+
+public static boolean assertTinyIntVectorValues(FieldVector fx, int 
rowCount, int[] values){
+TinyIntVector tinyIntVector = ((TinyIntVector)fx);
+
+assertEquals(rowCount, tinyIntVector.getValueCount());
+
+for(int j = 0; j < tinyIntVector.getValueCount(); j++){
+if(!tinyIntVector.isNull(j)) {
+assertEquals(values[j], tinyIntVector.get(j));
+}
+}
+return true;
+}
+
+public static boolean assertSmallIntVectorValues(FieldVector fx, int 
rowCount, int[] values){
+SmallIntVector smallIntVector = ((SmallIntVector)fx);
+
+assertEquals(rowCount, smallIntVector.getValueCount());
+
+for(int j = 0; j < smallIntVector.getValueCount(); j++){
+if(!smallIntVector.isNull(j)){
+assertEquals(values[j], smallIntVector.get(j));
+}
+}
+
+return true;
+}
+
+public static boolean assertBigIntVectorValues(FieldVector fx, int 
rowCount, int[] values){
+BigIntVector bigIntVector = ((BigIntVector)fx);
+
+assertEquals(rowCount, bigIntVector.getValueCount());
+
+for(int j = 0; j < bigIntVector.getValueCount(); j++){
+if(!bigIntVector.isNull(j)) {
+assertEquals(values[j], bigIntVector.get(j));
+}
+}
+
+return true;
+}
+
+public static boolean assertDecimalVectorValues(FieldVector fx, int 
rowCount, BigDecimal[] values){
+DecimalVector decimalVector = ((DecimalVector)fx);
+
+assertEquals(rowCount, decimalVector.getValueCount());
+
+for(int j = 0; j < decimalVector.getValueCount(); j++){
+if(!decimalVector.isNull(j)){
+

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416390#comment-16416390
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177588748
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses the following data mapping to map JDBC/SQL datatypes to Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL Exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String 
query) throws Exception {
+
+assert connection != null: "JDBC conncetion object can not be null";
+assert query != null && query.length() > 0: "SQL query can not be null 
or empty";
+
+RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
+
+Statement stmt = null;
 
 Review comment:
   it should be possible to use `try(resource initializations) { }` and get rid 
of the finally clause
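
   A sketch of the same method body using try-with-resources, reusing the helper calls shown in the excerpt: the Statement and ResultSet are closed automatically in reverse order, and the finally block goes away. (Whether the allocator should also be caller-supplied is a separate point raised earlier.)

    package org.apache.arrow.adapter.jdbc;

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.Statement;

    import org.apache.arrow.memory.RootAllocator;
    import org.apache.arrow.vector.VectorSchemaRoot;

    public class TryWithResourcesSketch {

        public static VectorSchemaRoot sqlToArrow(Connection connection, String query)
                throws Exception {
            RootAllocator rootAllocator = new RootAllocator(Integer.MAX_VALUE);
            // Both resources are closed automatically, even if the conversion throws.
            try (Statement stmt = connection.createStatement();
                 ResultSet rs = stmt.executeQuery(query)) {
                VectorSchemaRoot root = VectorSchemaRoot.create(
                        JdbcToArrowUtils.jdbcToArrowSchema(rs.getMetaData()), rootAllocator);
                JdbcToArrowUtils.jdbcToArrowVectors(rs, root);
                return root;
            }
        }
    }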



[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416401#comment-16416401
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177593874
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a helper class with functionality to read and assert the values from the given FieldVector object
+ *
+ */
+public class JdbcToArrowTestHelper {
+
+public static boolean assertIntVectorValues(FieldVector fx, int rowCount, 
int[] values) {
+IntVector intVector = ((IntVector) fx);
+
+assertEquals(rowCount, intVector.getValueCount());
+
+for(int j = 0; j < intVector.getValueCount(); j++) {
+if(!intVector.isNull(j)) {
 
 Review comment:
   if `intVector.isNull()` returns true, shouldn't we fail the assertion? (or 
maybe take an Integer[] array instead?)
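
   A sketch of the null-aware variant the comment hints at, using Integer[] so an expected null is asserted rather than silently skipped (the standalone class is only for illustration):

    import static org.junit.Assert.assertEquals;
    import static org.junit.Assert.assertNull;

    import org.apache.arrow.vector.IntVector;

    public class NullAwareAssertSketch {

        public static void assertIntVectorValues(IntVector intVector, int rowCount, Integer[] values) {
            assertEquals(rowCount, intVector.getValueCount());
            for (int j = 0; j < rowCount; j++) {
                if (values[j] == null) {
                    // An expected null must actually be null in the vector.
                    assertNull(intVector.getObject(j));
                } else {
                    assertEquals(values[j].intValue(), intVector.get(j));
                }
            }
        }
    }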




[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416398#comment-16416398
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177591507
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link 
ResultSetMetaData}.
+ *
+ * This method currently performs the following type mapping from JDBC SQL data types to the corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws 
SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder fields = ImmutableList.builder();
+List fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new 
ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416406#comment-16416406
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177594286
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/JdbcToArrowTestHelper.java
 ##
 @@ -0,0 +1,250 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TimeMilliVector;
+import org.apache.arrow.vector.TimeStampVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+
+import static org.junit.Assert.*;
+
+
+/**
+ * This is a Helper class which has functionalities to read and assert the values from the given FieldVector object
+ *
+ */
+public class JdbcToArrowTestHelper {
+
+public static boolean assertIntVectorValues(FieldVector fx, int rowCount, int[] values) {
+IntVector intVector = ((IntVector) fx);
+
+assertEquals(rowCount, intVector.getValueCount());
+
+for(int j = 0; j < intVector.getValueCount(); j++) {
+if(!intVector.isNull(j)) {
+assertEquals(values[j], intVector.get(j));
+}
+}
+return true;
+}
+
+public static boolean assertBitBooleanVectorValues(FieldVector fx, int rowCount, int[] values){
+BitVector bitVector = ((BitVector)fx);
+assertEquals(rowCount, bitVector.getValueCount());
+for(int j = 0; j < bitVector.getValueCount(); j++){
+if(!bitVector.isNull(j)) {
+assertEquals(values[j], bitVector.get(j));
+}
+}
+return true;
+}
+
+public static boolean assertTinyIntVectorValues(FieldVector fx, int rowCount, int[] values){
+TinyIntVector tinyIntVector = ((TinyIntVector)fx);
+
+assertEquals(rowCount, tinyIntVector.getValueCount());
+
+for(int j = 0; j < tinyIntVector.getValueCount(); j++){
+if(!tinyIntVector.isNull(j)) {
+assertEquals(values[j], tinyIntVector.get(j));
+}
+}
+return true;
+}
+
+public static boolean assertSmallIntVectorValues(FieldVector fx, int rowCount, int[] values){
+SmallIntVector smallIntVector = ((SmallIntVector)fx);
+
+assertEquals(rowCount, smallIntVector.getValueCount());
+
+for(int j = 0; j < smallIntVector.getValueCount(); j++){
+if(!smallIntVector.isNull(j)){
+assertEquals(values[j], smallIntVector.get(j));
+}
+}
+
+return true;
+}
+
+public static boolean assertBigIntVectorValues(FieldVector fx, int rowCount, int[] values){
+BigIntVector bigIntVector = ((BigIntVector)fx);
+
+assertEquals(rowCount, bigIntVector.getValueCount());
+
+for(int j = 0; j < bigIntVector.getValueCount(); j++){
+if(!bigIntVector.isNull(j)) {
+assertEquals(values[j], bigIntVector.get(j));
+}
+}
+
+return true;
+}
+
+public static boolean assertDecimalVectorValues(FieldVector fx, int rowCount, BigDecimal[] values){
+DecimalVector decimalVector = ((DecimalVector)fx);
+
+assertEquals(rowCount, decimalVector.getValueCount());
+
+for(int j = 0; j < decimalVector.getValueCount(); j++){
+if(!decimalVector.isNull(j)){
+

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416418#comment-16416418
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177596898
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/h2/ArrowDataFetcherTest.java
 ##
 @@ -0,0 +1,139 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc.h2;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.dataformat.yaml.YAMLFactory;
+import org.apache.arrow.adapter.jdbc.*;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import java.sql.Connection;
+import java.sql.DriverManager;
+
+import static org.junit.Assert.*;
+
+/**
+ * Test class for {@link ArrowDataFetcher}.
+ */
+public class ArrowDataFetcherTest extends AbstractJdbcToArrowTest {
+
+private Connection conn = null;
+private ObjectMapper mapper = null;
+
+@Before
+public void setUp() throws Exception {
+String url = "jdbc:h2:mem:ArrowDataFetcherTest";
+String driver = "org.h2.Driver";
+
+mapper = new ObjectMapper(new YAMLFactory());
+
+Class.forName(driver);
+
+conn = DriverManager.getConnection(url);
+}
+
+@After
+public void destroy() throws Exception {
+if (conn != null) {
+conn.close();
+conn = null;
+}
+}
+
+@Test
+public void commaSeparatedQueryColumnsTest() {
 
 Review comment:
   since this is not using any connection, it might be moved to a separate test class with no setup required?


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416392#comment-16416392
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177589405
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrowUtils.java
 ##
 @@ -0,0 +1,343 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.vector.*;
+import org.apache.arrow.vector.types.DateUnit;
+import org.apache.arrow.vector.types.TimeUnit;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.nio.charset.Charset;
+import java.sql.*;
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.apache.arrow.vector.types.FloatingPointPrecision.DOUBLE;
+import static org.apache.arrow.vector.types.FloatingPointPrecision.SINGLE;
+
+
+/**
+ * Class that does most of the work to convert JDBC ResultSet data into Arrow 
columnar format Vector objects.
+ *
+ * @since 0.10.0
+ */
+public class JdbcToArrowUtils {
+
+private static final int DEFAULT_BUFFER_SIZE = 256;
+
+/**
+ * Create Arrow {@link Schema} object for the given JDBC {@link ResultSetMetaData}.
+ *
+ * This method currently performs following type mapping for JDBC SQL data 
types to corresponding Arrow data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @param rsmd
+ * @return {@link Schema}
+ * @throws SQLException
+ */
+public static Schema jdbcToArrowSchema(ResultSetMetaData rsmd) throws SQLException {
+
+assert rsmd != null;
+
+//ImmutableList.Builder<Field> fields = ImmutableList.builder();
+List<Field> fields = new ArrayList<>();
+int columnCount = rsmd.getColumnCount();
+for (int i = 1; i <= columnCount; i++) {
+String columnName = rsmd.getColumnName(i);
+switch (rsmd.getColumnType(i)) {
+case Types.BOOLEAN:
+case Types.BIT:
+fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Bool()), null));
+break;
+case Types.TINYINT:
+fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Int(8, true)), null));
+break;
+case Types.SMALLINT:
+fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Int(16, true)), null));
+break;
+case Types.INTEGER:
+fields.add(new Field(columnName, FieldType.nullable(new ArrowType.Int(32, true)), null));
+break;
+case 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416318#comment-16416318
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177587078
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses following data mapping to map JDBC/SQL datatype to Arrow 
data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL Exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String query) throws Exception {
+
+assert connection != null: "JDBC conncetion object can not be null";
 
 Review comment:
   typo...


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416319#comment-16416319
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

laurentgo commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177587078
 
 

 ##
 File path: 
java/adapter/jdbc/src/main/java/org/apache/arrow/adapter/jdbc/JdbcToArrow.java
 ##
 @@ -0,0 +1,116 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.vector.VectorSchemaRoot;
+
+import java.sql.*;
+
+/**
+ * Utility class to convert JDBC objects to columnar Arrow format objects.
+ *
+ * This utility uses following data mapping to map JDBC/SQL datatype to Arrow 
data types.
+ *
+ * CHAR--> ArrowType.Utf8
+ * NCHAR   --> ArrowType.Utf8
+ * VARCHAR --> ArrowType.Utf8
+ * NVARCHAR --> ArrowType.Utf8
+ * LONGVARCHAR --> ArrowType.Utf8
+ * LONGNVARCHAR --> ArrowType.Utf8
+ * NUMERIC --> ArrowType.Decimal(precision, scale)
+ * DECIMAL --> ArrowType.Decimal(precision, scale)
+ * BIT --> ArrowType.Bool
+ * TINYINT --> ArrowType.Int(8, signed)
+ * SMALLINT --> ArrowType.Int(16, signed)
+ * INTEGER --> ArrowType.Int(32, signed)
+ * BIGINT --> ArrowType.Int(64, signed)
+ * REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
+ * DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
+ * BINARY --> ArrowType.Binary
+ * VARBINARY --> ArrowType.Binary
+ * LONGVARBINARY --> ArrowType.Binary
+ * DATE --> ArrowType.Date(DateUnit.MILLISECOND)
+ * TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
+ * TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
+ * CLOB --> ArrowType.Utf8
+ * BLOB --> ArrowType.Binary
+ *
+ * @since 0.10.0
+ * @see ArrowDataFetcher
+ */
+public class JdbcToArrow {
+
+/**
+ * For the given SQL query, execute and fetch the data from Relational DB 
and convert it to Arrow objects.
+ *
+ * @param connection Database connection to be used. This method will not 
close the passed connection object. Since the caller has passed
+ *   the connection object it's the responsibility of the 
caller to close or return the connection to the pool.
+ * @param query The DB Query to fetch the data.
+ * @return
+ * @throws SQLException Propagate any SQL Exceptions to the caller after 
closing any resources opened such as ResultSet and Statement objects.
+ */
+public static VectorSchemaRoot sqlToArrow(Connection connection, String query) throws Exception {
+
+assert connection != null: "JDBC conncetion object can not be null";
 
 Review comment:
   typo in the assert message


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> 

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416020#comment-16416020
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

siddharthteotia commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-376620084
 
 
   Going through the changes. Will finish reviewing soon.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416019#comment-16416019
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

siddharthteotia commented on a change in pull request #1759: ARROW-1780 - [WIP] 
JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector 
Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r177517745
 
 

 ##
 File path: 
java/adapter/jdbc/src/test/java/org/apache/arrow/adapter/jdbc/Table.java
 ##
 @@ -0,0 +1,74 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.adapter.jdbc;
+
+/**
+ *
 
 Review comment:
   I think class description is missing.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416005#comment-16416005
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

wesm commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to convert 
Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-376617996
 
 
   @atuldambalkar since this is a large PR, and we haven't had deep feedback 
from anyone focused on the Java implementation yet, it may be a little while 
for some review to come through. I would suggest bumping the mailing list 
thread about JDBC to draw more attention to this PR


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16416006#comment-16416006
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

siddharthteotia commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-376618019
 
 
   It would be good to add (or point to) simple example pieces of code to show 
the usage of the JDBC adapter code. This would help users understand how to go about 
writing an application that converts JDBC result sets to Arrow column vectors.
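
   As a rough illustration of what such an example might look like (a sketch only, based on the
   sqlToArrow(Connection, String) entry point described in this PR; the H2 URL, table and column
   names are made up for the example):

      // Sketch, not code from this PR: convert a query result to Arrow vectors.
      import java.sql.Connection;
      import java.sql.DriverManager;

      import org.apache.arrow.adapter.jdbc.JdbcToArrow;
      import org.apache.arrow.vector.IntVector;
      import org.apache.arrow.vector.VectorSchemaRoot;

      public class JdbcToArrowExample {
          public static void main(String[] args) throws Exception {
              // H2 in-memory database, the same driver the adapter's unit tests use
              Class.forName("org.h2.Driver");
              try (Connection conn = DriverManager.getConnection("jdbc:h2:mem:example")) {
                  conn.createStatement().execute("CREATE TABLE t (id INT)");
                  conn.createStatement().execute("INSERT INTO t VALUES (1), (2), (3)");

                  // Run the query through the adapter and read back the Arrow vector
                  try (VectorSchemaRoot root = JdbcToArrow.sqlToArrow(conn, "SELECT id FROM t")) {
                      IntVector ids = (IntVector) root.getVector("ID");   // H2 upper-cases column names
                      for (int i = 0; i < ids.getValueCount(); i++) {
                          System.out.println(ids.get(i));
                      }
                  }
              }
          }
      }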


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415991#comment-16415991
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-376613506
 
 
   Any update on this PR? I would be interested to know if there are any further 
review comments.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412260#comment-16412260
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar commented on issue #1759: ARROW-1780 - [WIP] JDBC Adapter to 
convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#issuecomment-375825688
 
 
   Uwe, 
   I have updated the code based on your comments. Also merged with latest 
0.10.0-SNAPSHOT. Let's wait to see if someone from the Java side can review this. I will 
send a message on Slack. Thanks.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409960#comment-16409960
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

xhochy commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r175906985
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,76 @@
+
+
+
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+<modelVersion>4.0.0</modelVersion>
+<groupId>org.apache.arrow.adapter.jdbc</groupId>
+<artifactId>arrow-jdbc</artifactId>
+<packaging>jar</packaging>
+<version>0.10.0-SNAPSHOT</version>
+<name>Arrow JDBC Adapter</name>
+<url>http://maven.apache.org</url>
+
+<dependencies>
+
+<dependency>
+<groupId>org.apache.arrow</groupId>
+<artifactId>arrow-memory</artifactId>
+<version>0.9.0-SNAPSHOT</version>
+</dependency>
+
+<dependency>
+<groupId>org.apache.arrow</groupId>
+<artifactId>arrow-vector</artifactId>
+<version>0.9.0-SNAPSHOT</version>
 
 Review comment:
   See above


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409958#comment-16409958
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

xhochy commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r175906418
 
 

 ##
 File path: dev/release/rat_exclude_files.txt
 ##
 @@ -74,3 +74,5 @@ c_glib/doc/reference/gtk-doc.make
 *.svg
 *.devhelp2
 *.scss
+*.yml
 
 Review comment:
   Don't exclude these files but add a license header to them
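
   For illustration, and assuming the standard ASF header already used by the Java sources in
   this PR: a .yml file can carry that same license text as YAML comments, for example:

      # Licensed to the Apache Software Foundation (ASF) under one
      # or more contributor license agreements.  See the NOTICE file
      # distributed with this work for additional information
      # regarding copyright ownership.  The ASF licenses this file
      # to you under the Apache License, Version 2.0 (the
      # "License"); you may not use this file except in compliance
      # with the License.  You may obtain a copy of the License at
      #
      #   http://www.apache.org/licenses/LICENSE-2.0
      #
      # Unless required by applicable law or agreed to in writing, software
      # distributed under the License is distributed on an "AS IS" BASIS,
      # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
      # See the License for the specific language governing permissions and
      # limitations under the License.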


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16409959#comment-16409959
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

xhochy commented on a change in pull request #1759: ARROW-1780 - [WIP] JDBC 
Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759#discussion_r175906922
 
 

 ##
 File path: java/adapter/jdbc/pom.xml
 ##
 @@ -0,0 +1,76 @@
+
+
+
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+ xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
+
+<modelVersion>4.0.0</modelVersion>
+<groupId>org.apache.arrow.adapter.jdbc</groupId>
+<artifactId>arrow-jdbc</artifactId>
+<packaging>jar</packaging>
+<version>0.10.0-SNAPSHOT</version>
+<name>Arrow JDBC Adapter</name>
+<url>http://maven.apache.org</url>
+
+<dependencies>
+
+<dependency>
+<groupId>org.apache.arrow</groupId>
+<artifactId>arrow-memory</artifactId>
+<version>0.9.0-SNAPSHOT</version>
 
 Review comment:
   Use `${project.version}` here instead of `0.9.0-SNAPSHOT`
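
   For illustration, the dependency block in java/adapter/jdbc/pom.xml would then look roughly
   like this (a sketch of the suggested change, not the committed file):

      <dependency>
        <groupId>org.apache.arrow</groupId>
        <artifactId>arrow-memory</artifactId>
        <version>${project.version}</version>
      </dependency>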


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401077#comment-16401077
 ] 

ASF GitHub Bot commented on ARROW-1780:
---

atuldambalkar opened a new pull request #1759: ARROW-1780 - (WIP) JDBC Adapter 
to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759
 
 
   This code enhancement is for converting JDBC ResultSet Relational objects to 
Arrow columnar data Vector objects. The code is under the directory 
"java/adapter/jdbc/src/main".
   
   The API has the following static methods in the 
   
   class org.apache.arrow.adapter.jdbc.JdbcToArrow -
   
   public static VectorSchemaRoot sqlToArrow(Connection connection, String query)
   public static ArrowDataFetcher jdbcArrowDataFetcher(Connection connection, String tableName)
   
   The utility uses the following data mapping to convert JDBC/SQL data types to Arrow 
data types -
   CHAR --> ArrowType.Utf8
   NCHAR--> ArrowType.Utf8
   VARCHAR --> ArrowType.Utf8
   NVARCHAR --> ArrowType.Utf8
   LONGVARCHAR --> ArrowType.Utf8
   LONGNVARCHAR --> ArrowType.Utf8
   NUMERIC --> ArrowType.Decimal(precision, scale)
   DECIMAL --> ArrowType.Decimal(precision, scale)
   BIT --> ArrowType.Bool
   TINYINT --> ArrowType.Int(8, signed)
   SMALLINT --> ArrowType.Int(16, signed)
   INTEGER --> ArrowType.Int(32, signed)
   BIGINT --> ArrowType.Int(64, signed)
   REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
   FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
   DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
   BINARY --> ArrowType.Binary
   VARBINARY --> ArrowType.Binary
   LONGVARBINARY --> ArrowType.Binary
   DATE --> ArrowType.Date(DateUnit.MILLISECOND)
   TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
   TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
   CLOB --> ArrowType.Utf8
   BLOB --> ArrowType.Binary
   
   JUnit test cases are under java/adapter/jdbc/src/test. The test cases use an H2 
in-memory database. 
   
   I am still working on adding and automating additional test cases. 
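
   As a small, hedged illustration of the mapping table above (not code from this PR): the listed
   JDBC types translate into Arrow Field/Schema objects along these lines; the column names are
   made up.

      // Sketch only: the ArrowType instances the mapping above refers to.
      import java.util.Arrays;

      import org.apache.arrow.vector.types.FloatingPointPrecision;
      import org.apache.arrow.vector.types.TimeUnit;
      import org.apache.arrow.vector.types.pojo.ArrowType;
      import org.apache.arrow.vector.types.pojo.Field;
      import org.apache.arrow.vector.types.pojo.FieldType;
      import org.apache.arrow.vector.types.pojo.Schema;

      public class JdbcTypeMappingSketch {
          public static void main(String[] args) {
              // INTEGER --> ArrowType.Int(32, signed)
              Field intCol = new Field("int_col", FieldType.nullable(new ArrowType.Int(32, true)), null);
              // VARCHAR --> ArrowType.Utf8
              Field textCol = new Field("varchar_col", FieldType.nullable(new ArrowType.Utf8()), null);
              // DOUBLE --> ArrowType.FloatingPoint(DOUBLE)
              Field dblCol = new Field("double_col",
                      FieldType.nullable(new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)), null);
              // TIMESTAMP --> ArrowType.Timestamp(MILLISECOND, timezone=null)
              Field tsCol = new Field("ts_col",
                      FieldType.nullable(new ArrowType.Timestamp(TimeUnit.MILLISECOND, null)), null);

              Schema schema = new Schema(Arrays.asList(intCol, textCol, dblCol, tsCol));
              System.out.println(schema);
          }
      }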


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Assignee: Atul Dambalkar
>Priority: Major
>  Labels: pull-request-available
> Fix For: 0.10.0
>
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

2018-02-19 Thread Atul Dambalkar (JIRA)

[ 
https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369612#comment-16369612
 ] 

Atul Dambalkar commented on ARROW-1780:
---

Based on the above comments, I will update the API with the necessary parameters. It will be more or less like pagination.
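
For illustration only (this is not the updated API, just a sketch of the caller-side shape such
pagination could take with the existing sqlToArrow(Connection, String) entry point; the table
name and the LIMIT/OFFSET query are made up, using H2-style SQL):

   // Sketch: read a table in fixed-size pages through the existing entry point.
   import java.sql.Connection;

   import org.apache.arrow.adapter.jdbc.JdbcToArrow;
   import org.apache.arrow.vector.VectorSchemaRoot;

   public class PagedFetchSketch {
       public static void readInPages(Connection connection) throws Exception {
           int pageSize = 1000;
           for (int offset = 0; ; offset += pageSize) {
               try (VectorSchemaRoot page = JdbcToArrow.sqlToArrow(connection,
                       "SELECT * FROM employee LIMIT " + pageSize + " OFFSET " + offset)) {
                   if (page.getRowCount() == 0) {
                       break;   // no more rows
                   }
                   // work with the Arrow vectors in "page" here
               }
           }
       }
   }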

> JDBC Adapter for Apache Arrow
> -
>
> Key: ARROW-1780
> URL: https://issues.apache.org/jira/browse/ARROW-1780
> Project: Apache Arrow
>  Issue Type: New Feature
>Reporter: Atul Dambalkar
>Priority: Major
>
> At a high level the JDBC Adapter will allow upstream apps to query RDBMS data 
> over JDBC and get the JDBC objects converted to Arrow objects/structures. The 
> upstream utility can then work with Arrow objects/structures with usual 
> performance benefits. The utility will be very much similar to C++ 
> implementation of "Convert a vector of row-wise data into an Arrow table" as 
> described here - 
> https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from RDBMS and covert the data into Arrow 
> objects/structures. So from that perspective this will Read data from RDBMS, 
> If the utility can push Arrow objects to RDBMS is something need to be 
> discussed and will be out of scope for this utility for now. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

