[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181283#comment-16181283 ]

ASF GitHub Bot commented on ORC-226:

Github user omalley commented on the issue:

    https://github.com/apache/orc/pull/151

    Ok, this is generally good. A couple of points that I'll fix as part of committing:
    * the API should use specific types (e.g. uint32_t instead of int)
    * since the enums are for serialization, I think it will be clearer if we assign explicit values to them
    * if the footer doesn't have a writerId, it is ORC_JAVA_WRITER and not an exception

> Support getWriterId in c++ reader interface
> ---
>
> Key: ORC-226
> URL: https://issues.apache.org/jira/browse/ORC-226
> Project: ORC
> Issue Type: Sub-task
> Components: C++
> Reporter: Xiening Dai
>
> We just added writer ID to identify files generated by different writers (we
> have three currently). Need an interface for reader to get this ID back.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
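The three review points above (fixed-width types, explicit enum values, and a missing writerId meaning ORC Java) could be sketched roughly as follows. This is only an illustration and not the committed code: FooterStub and writerIdFromFooter are hypothetical names, the values 0, 1 and 2 come from the doc comment in the reviewed diff, and the value chosen for UNKNOWN_WRITER is an assumption borrowed from the ORC_FUTURE = INT_MAX suggestion elsewhere in this thread.

```cpp
#include <cstdint>

// Explicit values, because the enum mirrors what is serialized in the file.
// 0/1/2 follow the doc comment in the reviewed diff; the UNKNOWN_WRITER
// value is an assumption for this sketch.
enum WriterId : uint32_t {
  ORC_JAVA_WRITER = 0,
  ORC_CPP_WRITER = 1,
  PRESTO_WRITER = 2,
  UNKNOWN_WRITER = 0x7FFFFFFF
};

// Hypothetical stand-in for the parts of the file footer that matter here.
struct FooterStub {
  bool hasWriter;   // whether the footer recorded a writer id at all
  uint32_t writer;  // raw id as stored in the file
};

// A footer without a writer id is treated as ORC Java instead of an error.
WriterId writerIdFromFooter(const FooterStub& footer) {
  if (!footer.hasWriter) {
    return ORC_JAVA_WRITER;
  }
  switch (footer.writer) {
    case ORC_JAVA_WRITER: return ORC_JAVA_WRITER;
    case ORC_CPP_WRITER:  return ORC_CPP_WRITER;
    case PRESTO_WRITER:   return PRESTO_WRITER;
    default:              return UNKNOWN_WRITER;
  }
}
```

With this shape a caller never has to handle an exception for old files; it only has to decide what to do with UNKNOWN_WRITER.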
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173715#comment-16173715 ]

ASF GitHub Bot commented on ORC-226:

Github user xndai commented on the issue:

    https://github.com/apache/orc/pull/151

    Squash commit. Thanks @ajayyadava @majetideepak
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169045#comment-16169045 ]

ASF GitHub Bot commented on ORC-226:

Github user ajayyadava commented on the issue:

    https://github.com/apache/orc/pull/151

    @xndai You can use the following command to squash your last K commits into one:

        git reset --soft HEAD~K && git commit

    For example, to squash the last 2 commits into one, replace K with 2:

        git reset --soft HEAD~2 && git commit

    I would advise that you create a separate branch from the branch you want to commit and try it on that first.
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167828#comment-16167828 ]

ASF GitHub Bot commented on ORC-226:

Github user xndai commented on the issue:

    https://github.com/apache/orc/pull/151

    @majetideepak how do I do that? :)
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164817#comment-16164817 ]

ASF GitHub Bot commented on ORC-226:

Github user majetideepak commented on the issue:

    https://github.com/apache/orc/pull/151

    @xndai can you please squash your commits?
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151529#comment-16151529 ]

ASF GitHub Bot commented on ORC-226:

Github user xndai commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r136697977

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    ok, that sounds good. I will update it in my next change.
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146203#comment-16146203 ]

ASF GitHub Bot commented on ORC-226:

Github user omalley commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r135922340

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    Actually, we do have the error stream, but we don't need to use it. I'd prefer not to use side effects in the parameters, because they are easy to misread in the code.

    For the code and most users, the important part is which writer it is, if known. Only diagnostic tools will care about the integer value of unknown ones. How about:

        enum WriterId { ORC_JAVA_WRITER, ORC_CPP_WRITER, PRESTO_WRITER, UNKNOWN_WRITER };

        class Reader {
          WriterId getWriterId();

          /**
           * For unknown writer ids, get the value.
           */
          int getUnknownWriterIdValue();
        };
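As a rough illustration of how the two-method proposal above might look from the caller's side, here is a minimal sketch. WriterInfo and describeWriter are hypothetical names invented for this example (they stand in for the Reader, which needs a real file), and the 0/1/2 mapping is taken from the doc comment in the diff.

```cpp
#include <cstdint>
#include <string>

enum WriterId { ORC_JAVA_WRITER, ORC_CPP_WRITER, PRESTO_WRITER, UNKNOWN_WRITER };

// Hypothetical stand-in for the reader: it only remembers the raw id
// that was found in the file.
class WriterInfo {
 public:
  explicit WriterInfo(uint32_t rawId) : rawId_(rawId) {}

  // Most callers only need to know which known writer produced the file.
  WriterId getWriterId() const {
    switch (rawId_) {
      case 0:  return ORC_JAVA_WRITER;
      case 1:  return ORC_CPP_WRITER;
      case 2:  return PRESTO_WRITER;
      default: return UNKNOWN_WRITER;
    }
  }

  // Diagnostic tools can still recover the raw value of an unknown writer.
  int getUnknownWriterIdValue() const { return static_cast<int>(rawId_); }

 private:
  uint32_t rawId_;
};

// Example of a diagnostic-style caller (hypothetical helper).
std::string describeWriter(const WriterInfo& info) {
  switch (info.getWriterId()) {
    case ORC_JAVA_WRITER: return "ORC Java";
    case ORC_CPP_WRITER:  return "ORC C++";
    case PRESTO_WRITER:   return "Presto";
    case UNKNOWN_WRITER:  break;
  }
  return "unknown (" + std::to_string(info.getUnknownWriterIdValue()) + ")";
}
```

The design point is that ordinary code switches on the enum and never sees an out parameter, while the raw integer stays reachable for tools that want to print it.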
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128069#comment-16128069 ]

ASF GitHub Bot commented on ORC-226:

Github user xndai commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r10343

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    Then we need a logger in the Orc library, which we don't have now. What's the problem with returning the actual value instead of ORC_FUTURE?
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126017#comment-16126017 ]

ASF GitHub Bot commented on ORC-226:

Github user xndai commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r133008869

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    Ok, I see. But I am still not sure about returning ORC_FUTURE. The caller may want to obtain the exact writer ID for debugging purposes. It can choose to ignore it if it wants. We just give them one more option. What do you think?
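For comparison, the out-parameter signature quoted in the diff, which hands the exact ID back to the caller, might be used along these lines. This is a sketch only: WriterIdSource, FixedWriterIdSource and writerIdForLog are hypothetical names, and only the getWriterId signature itself is taken from the diff.

```cpp
#include <cstdint>
#include <string>

// Hypothetical interface carrying only the method quoted in the diff.
class WriterIdSource {
 public:
  virtual ~WriterIdSource() = default;
  // Returns true and fills `id` if the file recorded a writer id.
  virtual bool getWriterId(uint32_t& id) const = 0;
};

// Hypothetical implementation backed by fixed values, for illustration.
class FixedWriterIdSource : public WriterIdSource {
 public:
  FixedWriterIdSource(bool present, uint32_t id) : present_(present), id_(id) {}

  bool getWriterId(uint32_t& id) const override {
    if (!present_) {
      return false;  // leave `id` untouched when nothing was recorded
    }
    id = id_;
    return true;
  }

 private:
  bool present_;
  uint32_t id_;
};

// A debugging-style caller that wants the exact numeric value.
std::string writerIdForLog(const WriterIdSource& source) {
  uint32_t id = 0;
  if (!source.getWriterId(id)) {
    return "writer id not recorded";
  }
  return "writer id " + std::to_string(id);
}
```

The caller gets the raw number even for writers this build does not know about, at the cost of the out parameter that the review objects to.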
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123629#comment-16123629 ]

ASF GitHub Bot commented on ORC-226:

Github user omalley commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r132734767

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    Re-reading your question, I should clarify that the Presto writer hasn't been released yet.
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123627#comment-16123627 ]

ASF GitHub Bot commented on ORC-226:

Github user omalley commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r132734446

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    No, it won't. Dain from the Presto team was the one that suggested adding the field.
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122359#comment-16122359 ]

ASF GitHub Bot commented on ORC-226:

Github user omalley commented on a diff in the pull request:

    https://github.com/apache/orc/pull/151#discussion_r132571311

    --- Diff: c++/include/orc/Reader.hh ---
    @@ -288,6 +288,17 @@ namespace orc {
         virtual uint64_t getCompressionSize() const = 0;
         /**
    +     * Get ID of writer that generated the file.
    +     * Current availiable Orc writers:
    +     *   0 = ORC Java
    +     *   1 = ORC C++
    +     *   2 = Presto
    +     * @param id out parameter for writer id
    +     * @return true if writer id is availiable, false if otherwise
    +     */
    +    virtual bool getWriterId(uint32_t & id) const = 0;
    --- End diff --

    If it isn't there, it means ORC Java. Let's make an enum that matches this:

        enum WriterId {
          ORC_Java = 0,
          ORC_CPP = 1,
          ORC_Presto = 2,
          ORC_FUTURE = INT_MAX
        };

    Any unknown code should return ORC_FUTURE.

        virtual WriterId getWriterId() const = 0;
[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface
[ https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120869#comment-16120869 ]

ASF GitHub Bot commented on ORC-226:

GitHub user xndai opened a pull request:

    https://github.com/apache/orc/pull/151

    ORC-226 Support getWriterId in c++ reader interface

    Add new interface for reader to retrieve writer ID.

    Change-Id: I87939e448aa5eab1bc7ed728404ddbf41334d809

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xndai/orc dev_writerid

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/151.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #151

commit 0269e8a7df72b34f0af83f668f21ac627e00f6d3
Author: Xiening.Dai
Date:   2017-08-10T00:09:43Z

    ORC-226 Support getWriterId in c++ reader interface

    Add new interface for reader to retrieve writer ID.

    Change-Id: I87939e448aa5eab1bc7ed728404ddbf41334d809