[ https://issues.apache.org/jira/browse/CASSANDRA-1898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985805#action_12985805 ]
Jonathan Ellis commented on CASSANDRA-1898:
-------------------------------------------
Somewhat against my better judgement, Nick convinced me to leave the super
ghetto attempt-to-skip-corrupt-rows code alone, so for this ticket let's just
fix that comma there.
Created CASSANDRA-2041 for large row support, since that is a new feature.
> json2sstable should support streaming
> -------------------------------------
>
> Key: CASSANDRA-1898
> URL: https://issues.apache.org/jira/browse/CASSANDRA-1898
> Project: Cassandra
> Issue Type: Improvement
> Components: Tools
> Reporter: Nick Bailey
> Assignee: Pavel Yaskevich
> Fix For: 0.7.1
>
> Attachments: CASSANDRA-1898-v2.patch, CASSANDRA-1898-v3.patch,
> CASSANDRA-1898-v4.patch, CASSANDRA-1898.patch
>
> Original Estimate: 8h
> Time Spent: 8h
> Remaining Estimate: 0h
>
> json2sstable loads the entire json file into memory. This is so it can sort
> the file before creating an sstable. If the file was created using
> sstable2json and the partitioner isn't changing, this isn't necessary. For
> very large files this means json2sstable requires a huge amount of memory.
> There should be an option to stream the file. A simple check for out of order
> keys will prevent writing bad sstables.
> This should be possible with the SAX style parser available in our current
> json library.
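
The out-of-order check described above could be sketched roughly as follows.
This is an illustration in Python, not the code from the attached patches.
`check_sorted`, `OutOfOrderKeyError`, and `sort_key` are hypothetical names.
The sketch assumes some streaming parser already yields (key, row) pairs one
at a time, and `sort_key` stands in for whatever ordering the partitioner
imposes on keys.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple


class OutOfOrderKeyError(ValueError):
    """Raised when a key does not sort strictly after the previous key."""


def check_sorted(
    rows: Iterable[Tuple[str, Any]],
    sort_key: Callable[[str], Any] = lambda key: key,
) -> Iterator[Tuple[str, Any]]:
    """Yield (key, row) pairs unchanged, failing fast on out-of-order keys.

    Only the previous key's sort value is kept, so memory use does not
    grow with the size of the input.
    """
    previous = None
    seen_any = False
    for key, row in rows:
        current = sort_key(key)
        # Equal keys are rejected as well: a duplicate is not strictly ordered.
        if seen_any and current <= previous:
            raise OutOfOrderKeyError(
                f"key {key!r} does not sort after the previous key"
            )
        yield key, row
        previous = current
        seen_any = True


if __name__ == "__main__":
    # Hypothetical rows, standing in for what a streaming parser would emit.
    parsed = iter([("a", ["col1"]), ("b", ["col2"]), ("c", ["col3"])])
    for key, row in check_sorted(parsed):
        print(key, row)
```

A streaming writer would consume this generator and append each row as it
arrives, aborting on the first out-of-order key instead of producing a bad
sstable.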