[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-09-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181283#comment-16181283
 ] 

ASF GitHub Bot commented on ORC-226:


Github user omalley commented on the issue:

https://github.com/apache/orc/pull/151
  
Ok, this is generally good. A couple of points that I'll fix as part of 
committing:

* the API should use specific types (e.g. uint32_t instead of int)
* since the enums are for serialization, I think it will be clearer if we 
assign explicit values to them
* if the footer doesn't have a writerId, the result should be ORC_JAVA_WRITER 
rather than an exception
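
As a rough illustration of the three points above (a sketch, not the committed code): the enum below uses explicit values and a fixed-width raw type, and a footer with no writer field maps to ORC_JAVA_WRITER instead of raising an error. `FooterStub`, its fields, and `writerIdFromFooter` are hypothetical stand-ins for the real footer handling, and the choice of INT32_MAX for UNKNOWN_WRITER is an assumption.

```cpp
#include <cstdint>

// Explicit values, because these codes are serialized into the file footer.
enum WriterId {
  ORC_JAVA_WRITER = 0,
  ORC_CPP_WRITER = 1,
  PRESTO_WRITER = 2,
  UNKNOWN_WRITER = INT32_MAX  // assumed sentinel for codes this reader does not know
};

// Hypothetical stand-in for the parsed footer (not the real ORC protobuf type).
struct FooterStub {
  bool hasWriter;   // whether the writer field is present at all
  uint32_t writer;  // raw code stored by the writer, meaningful only if present
};

// Files written before the field existed can only come from ORC Java,
// so a missing field is reported as ORC_JAVA_WRITER rather than as an error.
WriterId writerIdFromFooter(const FooterStub& footer) {
  if (!footer.hasWriter) {
    return ORC_JAVA_WRITER;
  }
  switch (footer.writer) {
    case 0: return ORC_JAVA_WRITER;
    case 1: return ORC_CPP_WRITER;
    case 2: return PRESTO_WRITER;
    default: return UNKNOWN_WRITER;
  }
}
```

With this shape, callers never have to handle a "no writer id" case separately; they only compare against the enum.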



> Support getWriterId in c++ reader interface
> ---
>
> Key: ORC-226
> URL: https://issues.apache.org/jira/browse/ORC-226
> Project: ORC
>  Issue Type: Sub-task
>  Components: C++
>Reporter: Xiening Dai
>
> We just added writer ID to identify files generated by different writers (we 
> have three currently). Need an interface for reader to get this ID back.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-09-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16173715#comment-16173715
 ] 

ASF GitHub Bot commented on ORC-226:


Github user xndai commented on the issue:

https://github.com/apache/orc/pull/151
  
Squashed the commits. Thanks @ajayyadava @majetideepak 




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-09-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16169045#comment-16169045
 ] 

ASF GitHub Bot commented on ORC-226:


Github user ajayyadava commented on the issue:

https://github.com/apache/orc/pull/151
  
@xndai You can squash your last K commits into one with the following 
command:

    git reset --soft HEAD~K && git commit

For example, to squash the last 2 commits into one, replace K with 2:

    git reset --soft HEAD~2 && git commit

I would advise creating a separate branch from the branch you want to 
commit and trying it on that branch first.





[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-09-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16167828#comment-16167828
 ] 

ASF GitHub Bot commented on ORC-226:


Github user xndai commented on the issue:

https://github.com/apache/orc/pull/151
  
@majetideepak how do I do that? :)




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-09-13 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16164817#comment-16164817
 ] 

ASF GitHub Bot commented on ORC-226:


Github user majetideepak commented on the issue:

https://github.com/apache/orc/pull/151
  
@xndai  can you please squash your commits? 




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-09-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16151529#comment-16151529
 ] 

ASF GitHub Bot commented on ORC-226:


Github user xndai commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r136697977
  
--- Diff: c++/include/orc/Reader.hh ---
@@ -288,6 +288,17 @@ namespace orc {
 virtual uint64_t getCompressionSize() const = 0;
 
 /**
+ * Get ID of writer that generated the file.
+ * Currently available ORC writers:
+ * 0 = ORC Java
+ * 1 = ORC C++
+ * 2 = Presto
+ * @param id out parameter for writer id
+ * @return true if writer id is available, false otherwise
+ */
+virtual bool getWriterId(uint32_t & id) const = 0;
--- End diff --

ok, that sounds good. I will update it in my next change.




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16146203#comment-16146203
 ] 

ASF GitHub Bot commented on ORC-226:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r135922340
  

Actually, we do have the error stream, but we don't need to use it. I'd 
prefer not to use side effects in the parameters, because it is easy to misread 
in the code.

For the code and most users, the important part is which writer it is if 
known. Only diagnostic tools will care about the integer value of unknown ones. 
How about:

enum WriterId { ORC_JAVA_WRITER, ORC_CPP_WRITER, PRESTO_WRITER, 
UNKNOWN_WRITER };

class Reader {
public:
   WriterId getWriterId();
   /**
    * For unknown writer ids, get the value.
    */
   int getUnknownWriterIdValue();
};
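
A minimal sketch of how the two-method split suggested here might be used, assuming the raw code is an int: ordinary callers switch on the enum, and only a diagnostic helper asks for the raw number. `WriterInfo`, `RawWriterInfo`, and `describeWriter` are hypothetical names made up for this illustration; the real orc::Reader has many more members.

```cpp
#include <string>

enum WriterId { ORC_JAVA_WRITER, ORC_CPP_WRITER, PRESTO_WRITER, UNKNOWN_WRITER };

// Hypothetical cut-down interface holding only the two proposed methods.
class WriterInfo {
 public:
  virtual ~WriterInfo() = default;
  virtual WriterId getWriterId() const = 0;
  // For unknown writer ids, get the raw value.
  virtual int getUnknownWriterIdValue() const = 0;
};

// Hypothetical implementation backed by the raw code read from the file.
class RawWriterInfo : public WriterInfo {
 public:
  explicit RawWriterInfo(int rawCode) : rawCode_(rawCode) {}

  WriterId getWriterId() const override {
    // Codes 0..2 line up with the first three enumerators; anything else is unknown.
    return (rawCode_ >= 0 && rawCode_ <= 2) ? static_cast<WriterId>(rawCode_)
                                            : UNKNOWN_WRITER;
  }

  int getUnknownWriterIdValue() const override { return rawCode_; }

 private:
  int rawCode_;
};

// What a diagnostic tool might print: a name for known writers, the number otherwise.
std::string describeWriter(const WriterInfo& info) {
  switch (info.getWriterId()) {
    case ORC_JAVA_WRITER: return "ORC Java";
    case ORC_CPP_WRITER: return "ORC C++";
    case PRESTO_WRITER: return "Presto";
    case UNKNOWN_WRITER: break;
  }
  return "unknown writer (id " + std::to_string(info.getUnknownWriterIdValue()) + ")";
}
```

The point of the split is that neither method needs an out-parameter, so the call sites read as plain expressions.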






[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16128069#comment-16128069
 ] 

ASF GitHub Bot commented on ORC-226:


Github user xndai commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r10343
  

Then we need a logger in the ORC library, which we don't have now. What's the 
problem with returning the actual value instead of ORC_FUTURE?




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16126017#comment-16126017
 ] 

ASF GitHub Bot commented on ORC-226:


Github user xndai commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r133008869
  

Ok, I see. But I am still not sure about returning ORC_FUTURE. The caller 
may want to obtain the exact writer ID for debugging purposes. It can choose to 
ignore the value if it wants. We just give it one more option. What do you think?




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123629#comment-16123629
 ] 

ASF GitHub Bot commented on ORC-226:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r132734767
  

Re-reading your question, I should clarify that the Presto writer hasn't 
been released yet. 




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-11 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16123627#comment-16123627
 ] 

ASF GitHub Bot commented on ORC-226:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r132734446
  

No, it won't. Dain from the Presto team was the one who suggested adding 
the field.




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-10 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16122359#comment-16122359
 ] 

ASF GitHub Bot commented on ORC-226:


Github user omalley commented on a diff in the pull request:

https://github.com/apache/orc/pull/151#discussion_r132571311
  

If it isn't there, it means ORC Java. Let's make an enum that matches this:

enum WriterId { ORC_Java = 0, ORC_CPP = 1, ORC_Presto = 2, ORC_FUTURE = INT_MAX };

Any unknown code should return ORC_FUTURE.

   virtual WriterId getWriterId() const = 0;
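
To make the difference for callers concrete, here is a hedged side-by-side sketch of the out-parameter signature from the diff and the enum-returning signature suggested above. The free functions and their `present` / `stored` arguments are hypothetical stand-ins for the reader and the footer contents; only the enum itself is taken from the comment.

```cpp
#include <climits>
#include <cstdint>

enum WriterId { ORC_Java = 0, ORC_CPP = 1, ORC_Presto = 2, ORC_FUTURE = INT_MAX };

// Shape of the signature under review: the result comes back through `id`.
bool getWriterIdOutParam(bool present, uint32_t stored, uint32_t& id) {
  if (!present) {
    return false;  // `id` is left untouched in this case
  }
  id = stored;
  return true;
}

// Shape of the suggested signature: absent means ORC Java, unknown means ORC_FUTURE.
WriterId getWriterIdEnum(bool present, uint32_t stored) {
  if (!present) {
    return ORC_Java;
  }
  switch (stored) {
    case 0: return ORC_Java;
    case 1: return ORC_CPP;
    case 2: return ORC_Presto;
    default: return ORC_FUTURE;
  }
}

// Caller using the out-parameter: it must pre-declare `id` and combine two outcomes.
bool writtenByJavaOutParam(bool present, uint32_t stored) {
  uint32_t id = 0;
  bool found = getWriterIdOutParam(present, stored, id);
  return !found || id == 0;
}

// Caller using the enum: a single comparison.
bool writtenByJavaEnum(bool present, uint32_t stored) {
  return getWriterIdEnum(present, stored) == ORC_Java;
}
```

Both callers give the same answer; the enum version simply moves the "missing field means ORC Java" rule into one place.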




[jira] [Commented] (ORC-226) Support getWriterId in c++ reader interface

2017-08-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16120869#comment-16120869
 ] 

ASF GitHub Bot commented on ORC-226:


GitHub user xndai opened a pull request:

https://github.com/apache/orc/pull/151

ORC-226 Support getWriterId in c++ reader interface

Add new interface for reader to retrieve writer ID.

Change-Id: I87939e448aa5eab1bc7ed728404ddbf41334d809

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/xndai/orc dev_writerid

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/orc/pull/151.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #151


commit 0269e8a7df72b34f0af83f668f21ac627e00f6d3
Author: Xiening.Dai 
Date:   2017-08-10T00:09:43Z

ORC-226 Support getWriterId in c++ reader interface

Add new interface for reader to retrieve writer ID.

Change-Id: I87939e448aa5eab1bc7ed728404ddbf41334d809



