[
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496977
]
Owen O'Malley commented on HADOOP-1381:
---
As long as it is configurable, 100k as the default would be fine.
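
For a rough sense of the numbers behind a configurable sync interval, the sketch below estimates the fraction of a file taken up by sync markers at a given interval. The 20-byte marker size and the 2000-byte current interval are assumptions, not stated in this thread; they were chosen because they reproduce the 1% space cost mentioned elsewhere in the discussion. "100k" is read here as 100,000 bytes, which is also an assumption.

```python
# Rough estimate of sync-marker space overhead in a sequence file.
# ASSUMPTION: each sync marker occupies 20 bytes (not stated in the thread).
ASSUMED_MARKER_BYTES = 20


def sync_overhead(interval_bytes: int, marker_bytes: int = ASSUMED_MARKER_BYTES) -> float:
    """Approximate fraction of extra space spent on sync markers when one
    marker is written for every `interval_bytes` of record data."""
    if interval_bytes <= 0:
        raise ValueError("interval_bytes must be positive")
    return marker_bytes / interval_bytes


if __name__ == "__main__":
    # ASSUMPTION: 2000 bytes stands in for the current hard-coded interval;
    # 100_000 bytes is one reading of the proposed "100k" default.
    for interval in (2_000, 100_000):
        print(f"interval={interval:>7} bytes  overhead={sync_overhead(interval):.4%}")
```

Under these assumptions, a larger interval lowers the space cost in direct proportion, while a reader seeking into the middle of a file has to scan further, on average, before reaching the next marker.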
[
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496968
]
Owen O'Malley commented on HADOOP-1381:
---
The current setup forces all non-block compressed sequence files to
[
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496973
]
Doug Cutting commented on HADOOP-1381:
--
This sounds like a time/space tradeoff. We currently have a 1% space
[
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496435
]
Doug Cutting commented on HADOOP-1381:
--
reduce the overhead by a factor of 500
But if the overhead is
[
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496429
]
Doug Cutting commented on HADOOP-1381:
--
Why would this be better? The current design is to add them as
[
https://issues.apache.org/jira/browse/HADOOP-1381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496432
]
Owen O'Malley commented on HADOOP-1381:
---
If your input splits are roughly 128MB or so, putting in a sync