[ 
https://issues.apache.org/jira/browse/BEAM-214?focusedWorklogId=102184&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-102184
 ]

ASF GitHub Bot logged work on BEAM-214:
---------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/May/18 16:22
            Start Date: 15/May/18 16:22
    Worklog Time Spent: 10m 
      Work Description: lgajowy commented on issue #5242: [BEAM-214] ParquetIO
URL: https://github.com/apache/beam/pull/5242#issuecomment-389227867
 
 
   Ok, I posted some new commits. They include:
   - writing slices of bytes
   - cleanup of redundant code
   - Jenkins job for the IT (sorry, I somehow forgot to commit this earlier)
   - applying other suggestions
   
   As I mentioned in the comments, reading needs some more investigation. I 
can work on it in this PR or in a subsequent one. Which option do you think 
is best?
   
   Besides writing slices of bytes, the writing part can be optimized further 
by adding the `blockSizeHint` and enabling block size support. However, I 
didn't find any way to reach `Filesystem.getScheme()` in the `open()` method, 
which would allow me to set `defaultBlockSize()` based on the filesystem 
type. Do you have any hints about this? 
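
   For readers unfamiliar with the "writing slices of bytes" change, here is a 
minimal sketch of the general idea, not the code from this PR. It uses only 
the JDK; the class name `ChannelSliceOutputStream` and its `getPos()` method 
are hypothetical names chosen for illustration. The stream forwards the 
requested `(offset, length)` window of the caller's array straight to a 
`WritableByteChannel` instead of copying it or writing byte by byte. It does 
not cover the block size question raised above.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/** Hypothetical adapter: forwards byte slices to a channel without copying them. */
class ChannelSliceOutputStream extends OutputStream {
  private final WritableByteChannel channel;
  private long position = 0;

  ChannelSliceOutputStream(WritableByteChannel channel) {
    this.channel = channel;
  }

  @Override
  public void write(int b) throws IOException {
    // Single-byte writes are routed through the slice path.
    write(new byte[] {(byte) b}, 0, 1);
  }

  @Override
  public void write(byte[] b, int off, int len) throws IOException {
    // wrap() creates a view over the caller's array, so no bytes are copied here.
    ByteBuffer slice = ByteBuffer.wrap(b, off, len);
    // A channel may accept fewer bytes than offered, so loop until the slice is drained.
    while (slice.hasRemaining()) {
      channel.write(slice);
    }
    position += len;
  }

  /** Number of bytes written so far through this stream. */
  long getPos() {
    return position;
  }

  @Override
  public void close() throws IOException {
    channel.close();
  }
}
```

   The drain loop assumes a blocking channel; with a non-blocking channel it 
would spin while the channel accepts zero bytes.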
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
-------------------

    Worklog Id:     (was: 102184)
    Time Spent: 9h 10m  (was: 9h)

> Create Parquet IO
> -----------------
>
>                 Key: BEAM-214
>                 URL: https://issues.apache.org/jira/browse/BEAM-214
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-ideas
>            Reporter: Neville Li
>            Assignee: Jean-Baptiste Onofré
>            Priority: Minor
>          Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Would be nice to support Parquet files with projection and predicates.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
