Re: FW: [jira] [Updated] (ARROW-1780) JDBC Adapter for Apache Arrow
Thanks Wes.

Sent from my Samsung Galaxy smartphone.

On 2/21/18 7:37 AM (GMT-08:00), Wes McKinney <wesmck...@gmail.com> wrote:
> Hi Atul -- I added you to the contributor role in JIRA and assigned the
> issue to you.
Re: FW: [jira] [Updated] (ARROW-1780) JDBC Adapter for Apache Arrow
Hi Atul -- I added you to the contributor role in JIRA and assigned the issue
to you.

On Tue, Feb 20, 2018 at 11:20 PM, Atul Dambalkar <atul.dambal...@xoriant.com> wrote:
> Hi Uwe,
>
> In terms of process, does this bug need to be assigned to me? I tried, but
> I couldn't get it assigned to myself. Maybe you or someone from the Arrow
> team can do that?
>
> -Atul
FW: [jira] [Updated] (ARROW-1780) JDBC Adapter for Apache Arrow
Hi Uwe,

In terms of process, does this bug need to be assigned to me? I tried, but I
couldn't get it assigned to myself. Maybe you or someone from the Arrow team
can do that?

-Atul

-----Original Message-----
From: Uwe L. Korn (JIRA) [mailto:j...@apache.org]
Sent: Tuesday, February 20, 2018 12:29 PM
To: Atul Dambalkar <atul.dambal...@xoriant.com>
Subject: [jira] [Updated] (ARROW-1780) JDBC Adapter for Apache Arrow

[ https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Uwe L. Korn updated ARROW-1780:
-------------------------------
    Fix Version/s: 0.10.0

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
RE: JDBC Adapter for Apache-Arrow
Hi Uwe,

Sorry for the late response on this thread. We have started some discussions
internally. I wanted to know what help you would need specifically on the
JDBC Adapter front; we would be happy to collaborate. At this time, we were
mainly trying to model it around the C++ work that has gone in. Are there any
particular use cases/requirements you have in mind?

-Atul

-----Original Message-----
From: Jacques Nadeau [mailto:jacq...@apache.org]
Sent: Tuesday, January 09, 2018 7:41 PM
To: dev@arrow.apache.org
Subject: Re: JDBC Adapter for Apache-Arrow

We have some stuff in Dremio that we've planned on open sourcing but haven't
yet done so. We should try to get that out for others to consume.
Re: JDBC Adapter for Apache-Arrow
We have some stuff in Dremio that we've planned on open sourcing but haven't
yet done so. We should try to get that out for others to consume.

On Jan 7, 2018 11:49 AM, "Uwe L. Korn" <uw...@xhochy.com> wrote:
> Has anyone made progress on the JDBC adapter yet?
Re: JDBC Adapter for Apache-Arrow
Has anyone made progress on the JDBC adapter yet?

I recently came across a lot of projects with good JDBC drivers but not so
good drivers in Python. Having an Arrow-JDBC adapter would make these query
engines much more useful to the Python community. Being an Arrow committer
and one of the turbodbc authors, I have quite some knowledge in this area,
but my Java is a bit rusty and I have never dealt with JDBC, so I'm looking
for someone to collaborate on this feature.

Also, this might be my ultimate chance to finally contribute to the Java part
of Apache Arrow.

Uwe

On 07.11.2017 at 20:01, Julian Hyde <jh...@apache.org> wrote:
> I have logged https://issues.apache.org/jira/browse/CALCITE-2040 (I logged
> it within Calcite because it makes more sense as an Arrow adapter within
> Calcite than as a Calcite adapter within Arrow).
[jira] [Created] (ARROW-1780) JDBC Adapter for Apache Arrow
Atul Dambalkar created ARROW-1780:
-------------------------------------

             Summary: JDBC Adapter for Apache Arrow
                 Key: ARROW-1780
                 URL: https://issues.apache.org/jira/browse/ARROW-1780
             Project: Apache Arrow
          Issue Type: New Feature
            Reporter: Atul Dambalkar

At a high level, the JDBC Adapter will allow upstream apps to query RDBMS
data over JDBC and get the JDBC objects converted to Arrow
objects/structures. The upstream utility can then work with Arrow
objects/structures with the usual performance benefits. The utility will be
very similar to the C++ implementation of "Convert a vector of row-wise data
into an Arrow table" as described here:
https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html

The utility will read data from an RDBMS and convert the data into Arrow
objects/structures. So from that perspective this utility will read data from
an RDBMS. Whether the utility can also push Arrow objects to an RDBMS is
something that needs to be discussed and is out of scope for now.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
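The row-wise-to-columnar conversion the ticket describes can be sketched in
plain Java. This is only an illustration of the data movement, with invented
names: a real adapter would append cells into typed Arrow vectors rather than
`java.util.List`, and the rows would come from a JDBC `ResultSet` rather than
an in-memory list.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the row-to-columnar pivot the adapter performs.
// Plain Java lists stand in for Arrow vectors; the class and method names
// are hypothetical, not the adapter's actual API.
public class RowToColumnar {

    // Each inner array is one "row" as a JDBC ResultSet would deliver it.
    // The result is one list per column -- the columnar layout Arrow uses.
    public static List<List<Object>> toColumns(List<Object[]> rows, int columnCount) {
        List<List<Object>> columns = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            columns.add(new ArrayList<>());
        }
        for (Object[] row : rows) {
            for (int c = 0; c < columnCount; c++) {
                columns.get(c).add(row[c]); // append each cell to its column
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[]{1, "a"});
        rows.add(new Object[]{2, "b"});
        List<List<Object>> cols = toColumns(rows, 2);
        System.out.println(cols.get(0)); // the integer column
        System.out.println(cols.get(1)); // the string column
    }
}
```

The pivot is the whole point of the utility: JDBC hands back one row at a
time, while Arrow consumers want contiguous columns.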
Re: JDBC Adapter for Apache-Arrow
I have logged https://issues.apache.org/jira/browse/CALCITE-2040 (I logged it
within Calcite because it makes more sense as an Arrow adapter within Calcite
than as a Calcite adapter within Arrow).

Note the last paragraph about
https://issues.apache.org/jira/browse/CALCITE-2025 and bioinformatics file
formats. Readers for these formats would be useful extensions to Arrow
regardless of whether the data was ultimately going to be queried using SQL.
(Contributions welcome!) Calcite's bio adapter would build upon the Arrow
readers in two respects: (1) to read metadata from these files (e.g. are
there any extra fields?) and (2) to push down processing (filters, projects)
into the reader.

Julian

On Tue, Nov 7, 2017 at 10:21 AM, Atul Dambalkar <atul.dambal...@xoriant.com> wrote:
> Hi,
>
> Don't mean to interrupt the current discussion threads.
RE: JDBC Adapter for Apache-Arrow
Hi,

Don't mean to interrupt the current discussion threads. But, based on the
discussions so far on the JDBC Adapter piece, are we in a position to create
a JIRA ticket for this, as well as for the other piece about adding direct
Arrow object creation support to JDBC drivers? If yes, I can certainly go
ahead and create the JIRA for the JDBC Adapter work.

Julian, would you like to create the JIRA for the other item that you
proposed?

-Atul
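Atul's suggestion of using JDBC's metadata features to figure out the
underlying DB schema and define the Arrow columnar schema can be sketched as
a type mapping. The `java.sql.Types` codes below are real; the Arrow type
names are plain strings standing in for real Arrow types, and in a real
adapter the column names and type codes would come from `ResultSetMetaData`
or `DatabaseMetaData.getColumns()` rather than a hard-coded map.

```java
import java.sql.Types;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of deriving an Arrow-style schema from JDBC metadata.
// Arrow types are represented as descriptive strings, not real Arrow classes.
public class JdbcSchemaSketch {

    // Map a java.sql.Types code to an illustrative Arrow type name.
    public static String toArrowType(int jdbcType) {
        switch (jdbcType) {
            case Types.INTEGER:   return "Int(32, signed)";
            case Types.BIGINT:    return "Int(64, signed)";
            case Types.VARCHAR:   return "Utf8";
            case Types.DOUBLE:    return "FloatingPoint(DOUBLE)";
            case Types.TIMESTAMP: return "Timestamp";
            default:              return "Binary"; // crude catch-all fallback
        }
    }

    public static void main(String[] args) {
        // Columns as ResultSetMetaData might report them for a sample table.
        Map<String, Integer> columns = new LinkedHashMap<>();
        columns.put("id", Types.BIGINT);
        columns.put("name", Types.VARCHAR);
        for (Map.Entry<String, Integer> col : columns.entrySet()) {
            System.out.println(col.getKey() + ": " + toArrowType(col.getValue()));
        }
    }
}
```

A complete mapping would also have to carry nullability, precision/scale for
DECIMAL, and time zone information for TIMESTAMP, all of which JDBC metadata
exposes.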
RE: JDBC Adapter for Apache-Arrow
I also like the approach of adding an interface and making it part of Arrow,
so any specific JDBC driver can implement that interface to directly expose
Arrow objects without having to create JDBC objects in the first place. One
such implementation could be for Avatica itself, which Julian was suggesting
earlier.

-----Original Message-----
From: Julian Hyde [mailto:jh...@apache.org]
Sent: Tuesday, October 31, 2017 4:28 PM
To: dev@arrow.apache.org
Subject: Re: JDBC Adapter for Apache-Arrow

Yeah, I agree, it should be an interface defined as part of Arrow. Not
driver-specific.

> On Oct 31, 2017, at 1:37 PM, Laurent Goujon <laur...@dremio.com> wrote:
>
> I really like Julian's idea of unwrapping Arrow objects out of the JDBC
> ResultSet, but I wonder if the unwrap class has to be specific to the
> driver, and if an interface can be designed to be used by multiple drivers:
> for drivers based on Arrow, it means you could totally skip the
> serialization/deserialization from/to JDBC records.
> If such an interface exists, I would propose to add it to the Arrow
> project, with Arrow products/projects in charge of adding support for it
> in their own JDBC driver.
>
> Laurent
>
> On Tue, Oct 31, 2017 at 1:18 PM, Atul Dambalkar
> <atul.dambal...@xoriant.com> wrote:
>
>> Thanks for your thoughts Julian. I think adding support for Arrow
>> objects for the Avatica Remote Driver (AvaticaToArrowConverter) can
>> certainly be taken up as another activity. And you are right, we will
>> have to look at each specific JDBC driver to really optimize it
>> individually.
>>
>> I would be curious if there are any further inputs/comments from other
>> dev folks on the JDBC adapter aspect.
>>
>> -Atul
>>
>> -----Original Message-----
>> From: Julian Hyde [mailto:jh...@apache.org]
>> Sent: Tuesday, October 31, 2017 11:12 AM
>> To: dev@arrow.apache.org
>> Subject: Re: JDBC Adapter for Apache-Arrow
>>
>> Sorry I didn't read your email thoroughly enough. I was talking about
>> the inverse (JDBC reading from Arrow) whereas you are talking about
>> Arrow reading from JDBC. Your proposal makes perfect sense.
>>
>> JDBC is quite a chatty interface (a call for every column of every row,
>> plus an occasional call to find out whether values are null, and
>> objects such as strings and timestamps become Java heap objects), so
>> for specific JDBC drivers it may be possible to optimize. For example,
>> the Avatica remote driver receives row sets in an RPC response in
>> protobuf format. It may be useful if the JDBC driver were able to
>> expose a direct path from protobuf to Arrow.
>> "ResultSet.unwrap(AvaticaToArrowConverter.class)" might be one way to
>> achieve this.
>>
>> Julian
>>
>>> On Oct 31, 2017, at 10:41 AM, Atul Dambalkar
>>> <atul.dambal...@xoriant.com> wrote:
>>>
>>> Hi Julian,
>>>
>>> Thanks for your response. If I understand correctly (looking at other
>>> adapters), a Calcite-Arrow adapter would provide a SQL front end for
>>> in-memory Arrow data objects/structures. So from that perspective,
>>> are you suggesting building the Calcite-Arrow adapter?
>>>
>>> In this case, what we are saying is to provide a mechanism for
>>> upstream apps to be able to get/create Arrow objects/structures from
>>> a relational database. This would also mean converting row-like data
>>> from a SQL database to columnar Arrow data structures. The utility
>>> can perhaps make use of JDBC's MetaData features to figure out the
>>> underlying DB schema and define the Arrow columnar schema. Also, the
>>> underlying database in this case would be a relational DB and hence
>>> persisted to disk, but the Arrow objects, being in-memory, can be
>>> ephemeral.
>>>
>>> Please correct me if I am missing anything.
>>>
>>> -Atul
>>>
>>> -----Original Message-----
>>> From: Julian Hyde [mailto:jhyde.apa...@gmail.com]
>>> Sent: Monday, October 30, 2017 7:50 PM
>>> To: dev@arrow.apache.org
>>> Subject: Re: JDBC Adapter for Apache-Arrow
>>>
>>> How about writing an Arrow adapter for Calcite? I think it amounts to
>>> the same thing - you would inherit Calcite's SQL parser and Avatica
>>> JDBC stack.
>>>
>>> Would this database be ephemeral (i.e. would the data go away when you
>>> close the connection)? If not, how would you know where to load the
>>> data from?
>>>
>>> Julian
Re: JDBC Adapter for Apache-Arrow
http://lmgtfy.com/?q=unsubscribe+apache+arrow

> On Oct 31, 2017, at 5:20 PM, 丁锦祥 <vence...@gmail.com> wrote:
>
> unsubscribe
Re: JDBC Adapter for Apache-Arrow
I really like Julian's idea of unwrapping Arrow objects out of the JDBC ResultSet, but I wonder if the unwrap class has to be specific to the driver, or if an interface could be designed to be used by multiple drivers: for drivers based on Arrow, it means you could skip the serialization/deserialization from/to JDBC records entirely. If such an interface exists, I would propose adding it to the Arrow project, with Arrow products/projects in charge of adding support for it in their own JDBC drivers.

Laurent
RE: JDBC Adapter for Apache-Arrow
Thanks for your thoughts, Julian. I think adding support for Arrow objects in the Avatica remote driver (AvaticaToArrowConverter) can certainly be taken up as a separate activity. And you are right, we will have to look at each specific JDBC driver to optimize it individually.

I would be curious whether there are any further inputs/comments from other dev folks on the JDBC adapter aspect.

-Atul
Re: JDBC Adapter for Apache-Arrow
Sorry, I didn’t read your email thoroughly enough. I was talking about the inverse (JDBC reading from Arrow), whereas you are talking about Arrow reading from JDBC. Your proposal makes perfect sense.

JDBC is quite a chatty interface (a call for every column of every row, plus an occasional call to find out whether values are null, and objects such as strings and timestamps become Java heap objects), so for specific JDBC drivers it may be possible to optimize. For example, the Avatica remote driver receives row sets in an RPC response in protobuf format. It may be useful if the JDBC driver were able to expose a direct path from protobuf to Arrow. "ResultSet.unwrap(AvaticaToArrowConverter.class)" might be one way to achieve this.

Julian
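The `ResultSet.unwrap` idea above can be sketched without any driver at hand, because it rests on the standard `java.sql.Wrapper` contract that every `ResultSet` already implements. In this minimal sketch, `ArrowBatchProvider` and `ArrowAwareResultSet` are hypothetical names invented for illustration (a real driver's result set would implement the full `ResultSet` interface and hand back actual Arrow record batches); only the `unwrap`/`isWrapperFor` mechanics are standard JDBC.

```java
import java.sql.SQLException;
import java.sql.Wrapper;

// Hypothetical interface a driver could expose; not part of JDBC or Arrow today.
interface ArrowBatchProvider {
    // A real driver would return Arrow record batches here; a string stands in.
    String nextBatchDescription();
}

// Minimal stand-in for a driver-side result set, showing how the standard
// java.sql.Wrapper contract lets callers reach an Arrow-aware API directly.
class ArrowAwareResultSet implements Wrapper, ArrowBatchProvider {
    @Override
    public <T> T unwrap(Class<T> iface) throws SQLException {
        if (iface.isInstance(this)) {
            return iface.cast(this);
        }
        throw new SQLException("Not a wrapper for " + iface.getName());
    }

    @Override
    public boolean isWrapperFor(Class<?> iface) {
        return iface.isInstance(this);
    }

    @Override
    public String nextBatchDescription() {
        return "arrow-batch";
    }
}

public class UnwrapDemo {
    public static void main(String[] args) throws SQLException {
        ArrowAwareResultSet rs = new ArrowAwareResultSet();
        // Callers that know the interface can skip the row-at-a-time JDBC calls.
        if (rs.isWrapperFor(ArrowBatchProvider.class)) {
            ArrowBatchProvider provider = rs.unwrap(ArrowBatchProvider.class);
            System.out.println(provider.nextBatchDescription());
        }
    }
}
```

The point of routing through `unwrap` is that drivers without Arrow support keep working unchanged: `isWrapperFor` simply returns false and the caller falls back to ordinary JDBC reads.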
RE: JDBC Adapter for Apache-Arrow
Hi Julian,

Thanks for your response. If I understand correctly (looking at other adapters), a Calcite-Arrow adapter would provide a SQL front end for in-memory Arrow data objects/structures. So from that perspective, are you suggesting building the Calcite-Arrow adapter?

In this case, what we are proposing is a mechanism for upstream apps to get/create Arrow objects/structures from a relational database. This would also mean converting row-wise data from a SQL database to columnar Arrow data structures. The utility could make use of JDBC's metadata features to figure out the underlying DB schema and define the Arrow columnar schema. Also, the underlying database in this case would be any relational DB and hence persisted to disk, but the Arrow objects, being in-memory, can be ephemeral.

Please correct me if I am missing anything.

-Atul
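The schema-discovery step described above - reading JDBC metadata and deriving an Arrow schema from it - can be sketched as a type mapping. A real adapter would read `ResultSetMetaData` and build `org.apache.arrow.vector.types.pojo.Field` objects; in this dependency-free sketch, plain strings stand in for Arrow types, and `sqlTypeToArrow` is a hypothetical helper name, not an Arrow API.

```java
import java.sql.Types;

// Illustrative mapping from java.sql.Types codes to Arrow type names.
// The set of cases here is a small sample; a real adapter would need to
// cover the full java.sql.Types range and consult precision/scale from
// ResultSetMetaData for decimals, timestamps, etc.
public class SchemaMapping {
    public static String sqlTypeToArrow(int sqlType) {
        switch (sqlType) {
            case Types.INTEGER:   return "Int(32, signed)";
            case Types.BIGINT:    return "Int(64, signed)";
            case Types.DOUBLE:    return "FloatingPoint(DOUBLE)";
            case Types.VARCHAR:   return "Utf8";
            case Types.BOOLEAN:   return "Bool";
            case Types.TIMESTAMP: return "Timestamp";
            default:
                throw new IllegalArgumentException("Unmapped SQL type: " + sqlType);
        }
    }

    public static void main(String[] args) {
        System.out.println(sqlTypeToArrow(Types.INTEGER)); // prints "Int(32, signed)"
    }
}
```

With such a mapping in hand, the adapter can walk `ResultSetMetaData.getColumnCount()`/`getColumnType(i)` once per query and emit the Arrow schema before touching any row data.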
Re: JDBC Adapter for Apache-Arrow
How about writing an Arrow adapter for Calcite? I think it amounts to the same thing - you would inherit Calcite’s SQL parser and Avatica JDBC stack.

Would this database be ephemeral (i.e. would the data go away when you close the connection)? If not, how would you know where to load the data from?

Julian
JDBC Adapter for Apache-Arrow
Hi all,

I wanted to open up a conversation here regarding developing a Java-based JDBC adapter for Apache Arrow. I had a preliminary discussion with Wes McKinney and Siddharth Teotia on this a couple of weeks ago.

Basically, at a high level (over-simplified), this adapter/API will allow upstream apps to query RDBMS data over JDBC and get the JDBC objects converted to Arrow in-memory (JVM) objects/structures. The upstream utility can then work with Arrow objects/structures with the usual performance benefits. The utility will be very similar to the C++ implementation of "Convert a vector of row-wise data into an Arrow table" as described here - https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html.

How useful would this adapter be, and which other Apache projects would benefit from it? Based on the usability, we can open a JIRA for this activity and start looking into the implementation details.

Regards,
-Atul Dambalkar
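The core conversion the proposal describes - pivoting row-wise JDBC results into columnar structures, as in the linked C++ tutorial - can be sketched as follows. Rows are modeled here as `Object[]` (as if read via `ResultSet.next()`/`getObject(i)`), and plain lists stand in for Arrow's column vectors; a real adapter would instead allocate Arrow vectors (e.g. `IntVector`, `VarCharVector`) and populate them via the Arrow Java API. `pivot` is an illustrative helper name, not an Arrow function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the row-to-column pivot at the heart of a JDBC-to-Arrow adapter.
public class RowToColumnar {
    // Transpose row-wise records into one buffer per column.
    public static List<List<Object>> pivot(List<Object[]> rows, int columnCount) {
        List<List<Object>> columns = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            columns.add(new ArrayList<>());
        }
        // A single pass over the row-wise data fills every column buffer.
        for (Object[] row : rows) {
            for (int c = 0; c < columnCount; c++) {
                columns.get(c).add(row[c]);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
                new Object[] {1, "alice"},
                new Object[] {2, "bob"});
        List<List<Object>> cols = pivot(rows, 2);
        System.out.println(cols.get(1)); // prints "[alice, bob]"
    }
}
```

In practice the adapter would process rows in batches of a fixed size rather than materializing the whole result set, so memory use stays bounded and downstream consumers receive Arrow record batches incrementally.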