[jira] [Commented] (ORC-119) Create an abstraction named PhysicalWriter that abstracts where the Writer puts the bytes

2016-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15750047#comment-15750047
 ] 

Sergey Shelukhin commented on ORC-119:
--

Hmm, actually that seems to make the information about suppression of the 
streams inaccessible.

> Create an abstraction named PhysicalWriter that abstracts where the Writer puts the bytes
> -----------------------------------------------------------------------------------------
>
> Key: ORC-119
> URL: https://issues.apache.org/jira/browse/ORC-119
> Project: Orc
> Issue Type: Bug
> Components: Java
> Reporter: Owen O'Malley
>
> This is a forward port of HIVE-14453, which introduces PhysicalWriter.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (ORC-119) Create an abstraction named PhysicalWriter that abstracts where the Writer puts the bytes

2016-12-14 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15749216#comment-15749216
 ] 

Sergey Shelukhin commented on ORC-119:
--

1) How is appendDataStream used? If it's just called repeatedly with the same 
name where OutputReceiver.output would formerly be called, that makes sense.
2) Does compression really need to be in the same layer as encoding? I guess it 
makes sense for the current implementation.



[jira] [Commented] (ORC-119) Create an abstraction named PhysicalWriter that abstracts where the Writer puts the bytes

2016-12-14 Thread Owen O'Malley (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748911#comment-15748911
 ] 

Owen O'Malley commented on ORC-119:
---

The parts that broke the straight port are ORC-101's major changes to the bloom 
filters and the additional compression codecs. The compression codecs need to 
live in the writer rather than in the PhysicalWriter implementation, except 
for the codec used for the protobuf objects.

The PhysicalWriter API introduced in HIVE-14453 is clearly a work in progress 
and needs some more refinement. I believe the goal is to:

* provide a separation between the WriterImpl and the actual layout on disk
* provide the protobuf for the metadata for interpretation and modification

Based on that, we should remove any access to the underlying stream and 
position. 

Does this API meet the need?

https://gist.github.com/omalley/f5d7f8edd8fba47fd6e84c179568672d
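
To make the two goals above concrete, here is a minimal sketch of what such a separation could look like. It is not the API from the gist or from HIVE-14453: the interface name SketchPhysicalWriter, the in-memory implementation, and both method signatures are assumptions made up for illustration (only the method name appendDataStream appears elsewhere in this thread). The point it illustrates is that the writer passes named runs of bytes and never sees an output stream or a file position.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.util.LinkedHashMap;
import java.util.Map;

class PhysicalWriterSketch {

  /**
   * Hypothetical interface (not the real ORC API): the writer hands over
   * named byte runs and has no access to a stream or an offset.
   */
  interface SketchPhysicalWriter {
    /** May be called repeatedly with the same stream name. */
    void appendDataStream(String streamName, ByteBuffer data);

    /** The implementation alone decides how the buffered streams are laid out. */
    byte[] finalizeStripe();
  }

  /** Assumed layout for illustration: streams are concatenated in first-seen order. */
  static class InMemoryPhysicalWriter implements SketchPhysicalWriter {
    private final Map<String, ByteArrayOutputStream> streams = new LinkedHashMap<>();

    @Override
    public void appendDataStream(String streamName, ByteBuffer data) {
      ByteArrayOutputStream out =
          streams.computeIfAbsent(streamName, k -> new ByteArrayOutputStream());
      byte[] chunk = new byte[data.remaining()];
      data.get(chunk);
      out.write(chunk, 0, chunk.length);
    }

    @Override
    public byte[] finalizeStripe() {
      ByteArrayOutputStream stripe = new ByteArrayOutputStream();
      for (ByteArrayOutputStream s : streams.values()) {
        byte[] bytes = s.toByteArray();
        stripe.write(bytes, 0, bytes.length);
      }
      streams.clear(); // start the next stripe with no buffered streams
      return stripe.toByteArray();
    }
  }

  public static void main(String[] args) {
    SketchPhysicalWriter writer = new InMemoryPhysicalWriter();
    writer.appendDataStream("col1.data", ByteBuffer.wrap(new byte[] {1, 2}));
    writer.appendDataStream("col2.data", ByteBuffer.wrap(new byte[] {9}));
    writer.appendDataStream("col1.data", ByteBuffer.wrap(new byte[] {5}));
    System.out.println(writer.finalizeStripe().length); // prints 4
  }
}
```

Because the caller only names streams, a different implementation could reorder, drop, or redirect them without any change on the writer side.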




[jira] [Commented] (ORC-120) Create a backwards compatibility mode of ignoring names for evolution

2016-12-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/ORC-120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15748763#comment-15748763
 ] 

ASF GitHub Bot commented on ORC-120:


Github user asfgit closed the pull request at:

https://github.com/apache/orc/pull/72


> Create a backwards compatibility mode of ignoring names for evolution
> ---------------------------------------------------------------------
>
> Key: ORC-120
> URL: https://issues.apache.org/jira/browse/ORC-120
> Project: Orc
> Issue Type: Task
> Reporter: Owen O'Malley
>
> ORC's schema evolution uses the column names when they are available. Hive 
> 2.1 uses a positional schema, so ORC should support a backward compatibility 
> mode for Hive users during the transition.
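
The difference between name-based and positional matching described above can be sketched as follows. This is an illustration only, not ORC's schema evolution code: the class SchemaEvolutionSketch, the method mapColumns, the ignoreNames flag, and the example column names are all assumptions.

```java
import java.util.Arrays;
import java.util.List;

class SchemaEvolutionSketch {

  /**
   * For each reader column, returns the index of the file column it reads
   * from, or -1 if there is no match. Hypothetical helper for illustration.
   */
  static int[] mapColumns(List<String> fileColumns,
                          List<String> readerColumns,
                          boolean ignoreNames) {
    int[] mapping = new int[readerColumns.size()];
    for (int i = 0; i < readerColumns.size(); i++) {
      if (ignoreNames) {
        // Positional mode: reader column i reads file column i, if it exists.
        mapping[i] = i < fileColumns.size() ? i : -1;
      } else {
        // Name mode: look the reader column up by name in the file schema.
        mapping[i] = fileColumns.indexOf(readerColumns.get(i));
      }
    }
    return mapping;
  }

  public static void main(String[] args) {
    // Assumed example: the file carries placeholder names, the reader real ones.
    List<String> file = Arrays.asList("_col0", "_col1");
    List<String> reader = Arrays.asList("id", "name", "email");
    System.out.println(Arrays.toString(mapColumns(file, reader, false))); // prints [-1, -1, -1]
    System.out.println(Arrays.toString(mapColumns(file, reader, true)));  // prints [0, 1, -1]
  }
}
```

In this example, matching by name finds nothing because the names differ, while ignoring names recovers the first two columns by position.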


