Re: Parquet without Hadoop dependencies

2023-06-09 Thread Atour Mousavi Gourabi
Hi Gang, I don't think it's feasible to make a new module for it this way as a lot of the support for this part of the code (codecs, etc.) resides in parquet-hadoop. This means the module would likely require a dependency on parquet-hadoop, making it pretty useless. This could be avoided by

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Raphael Taylor-Davies (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17731001#comment-17731001 ] Raphael Taylor-Davies commented on PARQUET-: Within arrow-rs levels data is always

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Xuwei Fu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730992#comment-17730992 ] Xuwei Fu commented on PARQUET-: --- [~gszadovszky] Hi Gabor, maybe I'm misleading. 1. RLE for

[jira] [Comment Edited] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730988#comment-17730988 ] Gabor Szadovszky edited comment on PARQUET- at 6/9/23 2:40 PM: ---

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730988#comment-17730988 ] Gabor Szadovszky commented on PARQUET-: --- [~mwish], This is specifically about BOOLEAN

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730985#comment-17730985 ] Gang Wu commented on PARQUET-: -- [~mwish] Do you have any idea how the Rust or Go

Re: Parquet without Hadoop dependencies

2023-06-09 Thread Gang Wu
That may break many downstream projects. At least we cannot break parquet-hadoop (and any existing module). If you can add a new module like parquet-core and provide limited reader/writer features without hadoop support, and then make parquet-hadoop depend on parquet-core, that would be

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Xuwei Fu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730983#comment-17730983 ] Xuwei Fu commented on PARQUET-: --- I think in cpp, in 12.0.0, even if it's Format V1, writer can

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Gang Wu (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730982#comment-17730982 ] Gang Wu commented on PARQUET-: -- Thanks for bringing this up, [~gszadovszky]. I will look into it

[jira] [Commented] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730904#comment-17730904 ] Gabor Szadovszky commented on PARQUET-: --- [~apitrou], [~wgtmac], It seems my review was

[jira] [Assigned] (PARQUET-2222) [Format] RLE encoding spec incorrect for v2 data pages

2023-06-09 Thread Gabor Szadovszky (Jira)
[ https://issues.apache.org/jira/browse/PARQUET-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Szadovszky reassigned PARQUET-: - Assignee: Gang Wu > [Format] RLE encoding spec incorrect for v2 data pages >

Re: Parquet without Hadoop dependencies

2023-06-09 Thread Atour Mousavi Gourabi
Hi Gang, Backward compatibility does indeed seem challenging here. Especially as I'd rather see the writers/readers moved out of parquet-hadoop after they've been decoupled. What are your thoughts on this? Best regards, Atour From: Gang Wu Sent: Friday, June