gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-859381992
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-858568042
@eadwright, sorry, you're right. This is not tightly related to your PR.
Please remove the try-catch blocks for OOE and put an `@Ignore` annotation on
the test class for
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-858478591
@eadwright, I understand your concerns; I don't really like it either.
Meanwhile, I don't feel good about having a test that is not executed automatically.
Without regular
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-851298938
@eadwright, I've made some changes in the unit test (no more TODOs). See the
update
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-849434359
@eadwright, so the CI was
[executed](https://github.com/gszadovszky/parquet-mr/actions/runs/879304911)
somehow on my private repo and failed due to OoM. So, we may either
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-848941568
@eadwright, I've implemented a [unit
test](https://github.com/gszadovszky/parquet-mr/commit/fcaf41269470c03c088b7eb5598558d44013f59d)
to reproduce the issue and test your
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-847669598
@eadwright, I'll try to look into this during the week and write some Java code to
reproduce the issue.
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-844187674
Thanks, @eadwright, for explaining. I get it now.
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-84394
@eadwright, what do you mean by "necessary for the rows to spill over into a
second row group"? That should not be possible. Even the pages keep row
boundaries but for row
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-841139504
@advancedxy, thanks for explaining.
I think the best option is 2. It is up to the user to provide enough
resources for handling the large row groups or not writing
gszadovszky commented on pull request #902:
URL: https://github.com/apache/parquet-mr/pull/902#issuecomment-836325628
@eadwright,
I'll try to summarize the issue; please correct me if I'm wrong. Parquet-mr
is not able to write such big row groups (>2GB) because of the `int` array
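
For illustration only, here is a minimal Java sketch of the kind of limit the summary above points at. It assumes the truncated comment refers to Java arrays being indexed by `int`, so a single array cannot hold more than `Integer.MAX_VALUE` (about 2 GiB) elements. The class `RowGroupSizeSketch` and its helper methods are hypothetical names, not part of parquet-mr.

```java
public class RowGroupSizeSketch {

    // Hypothetical helper: true when a byte count could back a single Java array,
    // since array lengths are ints and therefore capped at Integer.MAX_VALUE.
    static boolean fitsInIntArray(long sizeInBytes) {
        return sizeInBytes >= 0 && sizeInBytes <= Integer.MAX_VALUE;
    }

    // Hypothetical helper: narrows a long size to int and fails loudly
    // (ArithmeticException) instead of silently wrapping around.
    static int checkedSize(long sizeInBytes) {
        return Math.toIntExact(sizeInBytes);
    }

    public static void main(String[] args) {
        long threeGiB = 3L * 1024 * 1024 * 1024; // a size larger than 2 GiB

        System.out.println(fitsInIntArray(threeGiB)); // false

        // A plain cast wraps around to a negative number.
        System.out.println((int) threeGiB); // -1073741824

        try {
            checkedSize(threeGiB);
        } catch (ArithmeticException e) {
            System.out.println("size does not fit in an int");
        }
    }
}
```

The checked conversion is used here because a silent `(int)` cast turns an oversized length into a negative value, which tends to surface later as a confusing failure rather than at the point where the size is computed.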