[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058723#comment-17058723 ] Igor Guzenko commented on DRILL-4455: - Hi [~lpathy], we had a long discussion about the project and decided to abandon it, since the migration would require rewriting a huge amount of execution-related code and risks breaking a lot of things. > Depend on Apache Arrow for Vector and Memory > > > Key: DRILL-4455 > URL: https://issues.apache.org/jira/browse/DRILL-4455 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.16.0, 1.17.0 >Reporter: Steven Phillips >Assignee: Igor Guzenko >Priority: Major > Labels: doc-impacting > Fix For: 1.18.0 > > > The code for value vectors and memory has been split and contributed to the > apache arrow repository. In order to help this project advance, Drill should > depend on the arrow project instead of internal value vector code. > This change will require recompiling any external code, such as UDFs and > StoragePlugins. The changes will mainly just involve renaming the classes to > the org.apache.arrow namespace. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058463#comment-17058463 ] Lakshmi Pathy S N commented on DRILL-4455: -- Hi Igor Guzenko, were you able to start work on this task?
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991480#comment-16991480 ] Igor Guzenko commented on DRILL-4455: - Hello [~liuqiyun2019], I'll start working on this task as soon as I'm done with DRILL-7406. There is no concrete schedule because it requires a large amount of work.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16991171#comment-16991171 ] liuqiyun2019 commented on DRILL-4455: - Is there any specific schedule on the implementation of this task? We are really looking forward to this improvement.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951349#comment-16951349 ] ASF GitHub Bot commented on DRILL-4455: --- arina-ielchiieva commented on pull request #398: DRILL-4455: Depend on Apache Arrow URL: https://github.com/apache/drill/pull/398 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16951348#comment-16951348 ] ASF GitHub Bot commented on DRILL-4455: --- arina-ielchiieva commented on issue #398: DRILL-4455: Depend on Apache Arrow URL: https://github.com/apache/drill/pull/398#issuecomment-541910950 Closing PR as outdated; a new PR will be opened. See [DRILL-4455](https://issues.apache.org/jira/browse/DRILL-4455) for details.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597891#comment-16597891 ] Wes McKinney commented on DRILL-4455: - Great. Arrow has some departures from Drill's ValueVectors, like using 1 bit for validity instead of 1 byte, but otherwise I don't think it would be too difficult to "cast" from one to the other. Obviously it would be great for Drill to use Arrow natively at some point > Depend on Apache Arrow for Vector and Memory > > > Key: DRILL-4455 > URL: https://issues.apache.org/jira/browse/DRILL-4455 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Steven Phillips >Priority: Major > Fix For: Future > > > The code for value vectors and memory has been split and contributed to the > apache arrow repository. In order to help this project advance, Drill should > depend on the arrow project instead of internal value vector code. > This change will require recompiling any external code, such as UDFs and > StoragePlugins. The changes will mainly just involve renaming the classes to > the org.apache.arrow namespace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
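The validity difference mentioned in the comment above can be made concrete. What follows is a minimal sketch, assuming a Drill-style validity buffer that stores one byte per value (non-zero meaning "set") and an Arrow-style bitmap that packs one bit per value, least-significant bit first. `ValidityCast` and its methods are hypothetical names, and plain byte arrays stand in for either project's buffer classes.

```java
// Hypothetical helper: converts between byte-per-value and bit-per-value validity.
final class ValidityCast {
    private ValidityCast() {}

    /** Packs a byte-per-value validity array into a bitmap, LSB first within each byte. */
    static byte[] toBitmap(byte[] byteValidity) {
        byte[] bitmap = new byte[(byteValidity.length + 7) / 8];
        for (int i = 0; i < byteValidity.length; i++) {
            if (byteValidity[i] != 0) {
                // Value i lives in byte i / 8, at bit position i % 8.
                bitmap[i >> 3] |= (byte) (1 << (i & 7));
            }
        }
        return bitmap;
    }

    /** Expands a bitmap back into one byte (0 or 1) per value. */
    static byte[] toBytes(byte[] bitmap, int valueCount) {
        byte[] out = new byte[valueCount];
        for (int i = 0; i < valueCount; i++) {
            out[i] = (byte) ((bitmap[i >> 3] >> (i & 7)) & 1);
        }
        return out;
    }
}
```

The conversion touches every value once, so a "cast" of this kind costs a linear pass over the validity data rather than being free; the value data itself is not involved in this sketch.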
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595977#comment-16595977 ] Kunal Khatua commented on DRILL-4455: - There has been some talk on the dev list [Link|https://lists.apache.org/thread.html/8d895fb40702f3120532f15594ea935a818ac0eb5acdb4fd1248d89f@%3Cdev.drill.apache.org%3E]
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16527782#comment-16527782 ] Wes McKinney commented on DRILL-4455: - Having personally spent about 3 years of my life now focused on making Apache Arrow a cross-platform/language technology for in-memory analytics to tie the world together, I would really like to see Drill become a first-class citizen in the Arrow ecosystem. > Depend on Apache Arrow for Vector and Memory > > > Key: DRILL-4455 > URL: https://issues.apache.org/jira/browse/DRILL-4455 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips >Priority: Major > Fix For: Future > > > The code for value vectors and memory has been split and contributed to the > apache arrow repository. In order to help this project advance, Drill should > depend on the arrow project instead of internal value vector code. > This change will require recompiling any external code, such as UDFs and > StoragePlugins. The changes will mainly just involve renaming the classes to > the org.apache.arrow namespace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506675#comment-16506675 ] ASF GitHub Bot commented on DRILL-4455: --- ilooner commented on issue #398: DRILL-4455: Depend on Apache Arrow URL: https://github.com/apache/drill/pull/398#issuecomment-395915947 @vrozov Since you are picking up DrillBuf design work, please advise on the plan for this PR. > Depend on Apache Arrow for Vector and Memory > > > Key: DRILL-4455 > URL: https://issues.apache.org/jira/browse/DRILL-4455 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips >Priority: Major > Fix For: 2.0.0 > > > The code for value vectors and memory has been split and contributed to the > apache arrow repository. In order to help this project advance, Drill should > depend on the arrow project instead of internal value vector code. > This change will require recompiling any external code, such as UDFs and > StoragePlugins. The changes will mainly just involve renaming the classes to > the org.apache.arrow namespace. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15743883#comment-15743883 ] Jacques Nadeau commented on DRILL-4455: --- +1 on Parth's suggestion. It would be great for someone to pick this up and work on it in a 2.0 branch. > Depend on Apache Arrow for Vector and Memory > > > Key: DRILL-4455 > URL: https://issues.apache.org/jira/browse/DRILL-4455 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > Fix For: 2.0.0 > > > The code for value vectors and memory has been split and contributed to the > apache arrow repository. In order to help this project advance, Drill should > depend on the arrow project instead of internal value vector code. > This change will require recompiling any external code, such as UDFs and > StoragePlugins. The changes will mainly just involve renaming the classes to > the org.apache.arrow namespace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742666#comment-15742666 ] Julian Hyde commented on DRILL-4455: [~jnadeau] and [~sphillips], Can you respond to [~parthc]'s suggestion? I don't want this rapprochement to go cold.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15742594#comment-15742594 ] Parth Chandra commented on DRILL-4455: -- For the record, I'm in favour of integrating Drill and Arrow. I still feel that Drill should manage its own memory, but I'm open to being convinced otherwise. If folks are in agreement, we can create a 2.0 branch (because this is a breaking change) and work on that branch.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15706607#comment-15706607 ] Julian Hyde commented on DRILL-4455: [~jnadeau], [~sphillips], [~amansinha100] and [~parthc], Can all parties please agree (and state publicly for the record) that moving value vector code out of Drill and into Arrow is in the best interests of the Drill project? Most contributions can be managed by a process of submitting a patch, review, reject, revise, and repeat. But this is not one of those patches that can be casually kicked back to the contributor. It is huge, because it is an architectural change. I would like to see a commitment from both sides (contributor and reviewer) that we will find consensus and accept the patch.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15687300#comment-15687300 ] Jacques Nadeau commented on DRILL-4455: ---

As someone who worked a lot on the memory layer and accounting stuff, I'm not sure how one would split it without introducing a level of indirection that would impact performance. The problem has to do with the ability to transfer data accounting that exists within the memory buffers, and trying to do that while maintaining a single canonical memory representation and supporting limits. For reference, please review the information at [1] to understand how the pieces work together.

We have two challenges I see at this point.

- This was originally proposed in November 2015. Note the attached slides in [2], specifically the last one, where all three approaches included the vectors and memory management moving together in the project (due to the nature of the coupling). Not hearing any disagreement, then going through the massive amount of work that this patch took to build, and then hitting a -1 six months later takes a lot of wind out of one's sails.
- The larger problem is I'm not sure who is going to have the interest to try to do this patch again. We're now ~6 months later with two trees that have moved in their own directions. Rebase is probably very difficult (or impossible).

My sense is that Arrow will continue to create value and at some point, the Drill community will achieve a consensus that it is valuable to do this work. In the meantime, I'm not sure anyone's heart is in it right now. So while it may make sense to ultimately try to come up with a better approach to modularity in the Arrow library around the first point, I'd like to see some demand from the community that wants to use Arrow to do that (possibly in the form of patches or approaches proposed).

PS: An interesting question would be: how much development has happened in the "disputed module" in Drill since this patch (or since my major reworking of it ~12 months ago)?

[1] https://github.com/apache/arrow/tree/master/java/memory/src/main/java/org/apache/arrow/memory
[2] http://markmail.org/thread/74ns3peuwbaolcod
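The coupling described in the comment above can be illustrated with a toy model. This is a minimal sketch, not Arrow's or Drill's actual allocator: `ToyAllocator` and `ToyBuffer` are hypothetical names, and a "buffer" here is only a size plus a reference to the allocator currently charged for it. It assumes a tree of allocators with per-node limits, and it shows why handing a buffer from one operator to another has to update accounting state held by the buffer itself if limits are to be enforced without copying the data.

```java
// Hypothetical allocator node: tracks bytes charged to itself and enforces a limit.
final class ToyAllocator {
    private final String name;
    private final ToyAllocator parent; // null for the root
    private final long limit;
    private long allocated;

    ToyAllocator(String name, ToyAllocator parent, long limit) {
        this.name = name;
        this.parent = parent;
        this.limit = limit;
    }

    ToyAllocator newChild(String childName, long childLimit) {
        return new ToyAllocator(childName, this, childLimit);
    }

    long allocated() {
        return allocated;
    }

    ToyBuffer buffer(long size) {
        reserve(size);
        return new ToyBuffer(size, this);
    }

    /** Charges this allocator and every ancestor, or throws if any limit would be exceeded. */
    void reserve(long size) {
        for (ToyAllocator a = this; a != null; a = a.parent) {
            if (a.allocated + size > a.limit) {
                throw new IllegalStateException(a.name + " would exceed its limit");
            }
        }
        for (ToyAllocator a = this; a != null; a = a.parent) {
            a.allocated += size;
        }
    }

    void release(long size) {
        for (ToyAllocator a = this; a != null; a = a.parent) {
            a.allocated -= size;
        }
    }
}

// Hypothetical buffer: the owner reference is the accounting state that lives "in" the buffer.
final class ToyBuffer {
    private final long size;
    private ToyAllocator owner;

    ToyBuffer(long size, ToyAllocator owner) {
        this.size = size;
        this.owner = owner;
    }

    ToyAllocator owner() {
        return owner;
    }

    /** Moves the charge to another allocator; the bytes themselves would not move. */
    void transferTo(ToyAllocator target) {
        owner.release(size);
        try {
            target.reserve(size);
        } catch (IllegalStateException e) {
            owner.reserve(size); // roll back so accounting stays consistent
            throw e;
        }
        owner = target;
    }
}
```

In this toy version, rejecting an over-limit transfer is a design choice made for simplicity; a real allocator could equally accept the transfer and report that the target is over its limit.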
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685763#comment-15685763 ] Julian Hyde commented on DRILL-4455: [~sphillips], [~jnadeau], What do you think about memory management/accounting? Is there potentially a compromise, say an interface for accounting that could be called by Arrow?
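One way to picture the compromise asked about above is a callback interface that the vector library invokes around each allocation, with the host engine supplying the implementation. The sketch below is purely hypothetical: `AccountingHook`, `HookedBufferFactory`, and `QueryBudget` are illustrative names and not part of Arrow or Drill, and a `java.nio.ByteBuffer` stands in for a real buffer type.

```java
// Hypothetical contract the library would call; the host engine implements it.
interface AccountingHook {
    /** Returns true if the host permits the allocation and has recorded it. */
    boolean tryReserve(long bytes);

    /** Tells the host that previously reserved bytes are no longer in use. */
    void released(long bytes);
}

// Library side: allocates only after the host has approved the request.
final class HookedBufferFactory {
    private final AccountingHook hook;

    HookedBufferFactory(AccountingHook hook) {
        this.hook = hook;
    }

    java.nio.ByteBuffer allocate(int bytes) {
        if (!hook.tryReserve(bytes)) {
            throw new IllegalStateException("allocation of " + bytes + " bytes denied by host");
        }
        return java.nio.ByteBuffer.allocateDirect(bytes);
    }

    /** Only updates accounting; the JVM reclaims the direct buffer on its own schedule. */
    void free(java.nio.ByteBuffer buffer) {
        hook.released(buffer.capacity());
    }
}

// Host side: a simple fixed budget, standing in for an engine's resource manager.
final class QueryBudget implements AccountingHook {
    private final long limit;
    private long used;

    QueryBudget(long limit) {
        this.limit = limit;
    }

    @Override
    public synchronized boolean tryReserve(long bytes) {
        if (used + bytes > limit) {
            return false;
        }
        used += bytes;
        return true;
    }

    @Override
    public synchronized void released(long bytes) {
        used -= bytes;
    }

    synchronized long used() {
        return used;
    }
}
```

Every allocation in this sketch goes through an interface call on the host, which is the kind of indirection whose performance cost is questioned elsewhere in this thread.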
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685384#comment-15685384 ] Aman Sinha commented on DRILL-4455: --- Moving the memory format out of Drill to a more specialized store such as Arrow is a benefit; however moving the memory management/accounting part is probably not. This was one of the previous comments; I am not familiar with the Arrow release but would support someone doing a feasibility analysis of drill + arrow.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15685135#comment-15685135 ] Julian Hyde commented on DRILL-4455: Now that Arrow has had a release, should we re-visit this? I have no dog in this race (or maybe I have two, as a PMC member of both Drill and Arrow), but it seems to me that moving the memory format out of Drill is to Drill's benefit.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15259205#comment-15259205 ] ASF GitHub Bot commented on DRILL-4455: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/398#issuecomment-214922608 Sounds like we need to do some more work to mature and stabilize the Arrow code. Let's pick this up again once we've made more progress on that front.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15256561#comment-15256561 ] ASF GitHub Bot commented on DRILL-4455: --- Github user parthchandra commented on the pull request: https://github.com/apache/drill/pull/398#issuecomment-214433058

I have two problems with this patch (apart from it being just too big to review)

1) Building against 0.1-SNAPSHOT of a project that is in rapid development is a courageous decision (I would rather not board a train while it is moving at full speed). The current patch also does not build as there is no artifact for Arrow 0.1-SNAPSHOT.

2) I see that the entire memory module of Drill is gone. I don't think that drill memory management and accounting should be outsourced to Arrow. I can see needing to use ArrowBuf for value vectors but removing the entire memory module is not OK. I can see other uses of Direct/Heap memory (caching metadata for example) that should be managed under a Drill resource manager. This ties back to #1. Arrow needs to get its design figured out to allow Drill and other projects to manage their resources while providing a 'reference' implementation.

-1 until we resolve this.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15252940#comment-15252940 ] ASF GitHub Bot commented on DRILL-4455: --- Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/398#issuecomment-213149221 I just rebased the patch and broke it up into 5 different patches.
[jira] [Commented] (DRILL-4455) Depend on Apache Arrow for Vector and Memory
[ https://issues.apache.org/jira/browse/DRILL-4455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15173072#comment-15173072 ] ASF GitHub Bot commented on DRILL-4455: ---

GitHub user StevenMPhillips opened a pull request: https://github.com/apache/drill/pull/398

DRILL-4455: Depend on Apache Arrow

Remove the vector and memory modules.

Replace most instances of TypeProtos.MajorType, TypeProtos.MinorType etc. with the corresponding org.apache.arrow.types.Types class.

The code for serializing and deserializing vectors is still in the Drill code-base and still uses the old TypeProtos classes. The class MajorTypeHelper is used to convert between the two sets of classes.

The old load and getMetadata methods in ValueVector are now found in a corresponding ValueVectorHelper class.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/StevenMPhillips/drill arrow9

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/398.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #398

commit 91790c43f2939a028e05f8c67093d43e96fe9caa
Author: Steven Phillips
Date: 2016-03-01T01:05:23Z

DRILL-4455: Depend on Apache Arrow

Remove the vector and memory modules.

Replace most instances of TypeProtos.MajorType, TypeProtos.MinorType etc. with the corresponding org.apache.arrow.types.Types class.

The code for serializing and deserializing vectors is still in the Drill code-base and still uses the old TypeProtos classes. The class MajorTypeHelper is used to convert between the two sets of classes.

The old load and getMetadata methods in ValueVector are now found in a corresponding ValueVectorHelper class.

> Depend on Apache Arrow for Vector and Memory > > > Key: DRILL-4455 > URL: https://issues.apache.org/jira/browse/DRILL-4455 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Assignee: Steven Phillips > Fix For: 1.7.0 > > > The code for value vectors and memory has been split and contributed to the > apache arrow repository. In order to help this project advance, Drill should > depend on the arrow project instead of internal value vector code. > This change will require recompiling any external code, such as UDFs and > StoragePlugins. The changes will mainly just involve renaming the classes to > the org.apache.arrow namespace. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
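The role the pull request gives MajorTypeHelper, converting between the type classes still used for serialization and the ones used everywhere else, might look roughly like the following. This is an illustrative sketch only: the two enums are stand-ins with a handful of made-up members, not the real `TypeProtos.MinorType` or Arrow `Types` definitions, and `TypeBridge` is a hypothetical name rather than the actual helper's API.

```java
// Stand-in for the serialization-side type set (members are illustrative).
enum LegacyType { INT, BIGINT, FLOAT8, VARCHAR, LEGACY_ONLY }

// Stand-in for the in-memory type set (members are illustrative).
enum NewType { INT, BIGINT, FLOAT8, VARCHAR }

// Hypothetical converter: an explicit table, so gaps between the two sets are visible.
final class TypeBridge {
    private static final java.util.Map<LegacyType, NewType> TO_NEW =
        new java.util.EnumMap<>(LegacyType.class);

    static {
        TO_NEW.put(LegacyType.INT, NewType.INT);
        TO_NEW.put(LegacyType.BIGINT, NewType.BIGINT);
        TO_NEW.put(LegacyType.FLOAT8, NewType.FLOAT8);
        TO_NEW.put(LegacyType.VARCHAR, NewType.VARCHAR);
        // LEGACY_ONLY is deliberately unmapped to show the failure path.
    }

    private TypeBridge() {}

    static NewType toNew(LegacyType type) {
        NewType result = TO_NEW.get(type);
        if (result == null) {
            throw new IllegalArgumentException("no equivalent for " + type);
        }
        return result;
    }

    static LegacyType toLegacy(NewType type) {
        for (java.util.Map.Entry<LegacyType, NewType> entry : TO_NEW.entrySet()) {
            if (entry.getValue() == type) {
                return entry.getKey();
            }
        }
        throw new IllegalArgumentException("no equivalent for " + type);
    }
}
```

Keeping the conversion in one helper means the serialization code can stay on the old classes while the rest of the code base moves, at the cost of a lookup at each boundary crossing.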