[ https://issues.apache.org/jira/browse/CASSANDRA-8404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14388571#comment-14388571 ]
Philip Thompson commented on CASSANDRA-8404:
--------------------------------------------
Have you had the same issue on 2.1.3?
> CQLSSTableLoader can not create SSTable for csv file of 10M rows.
> -----------------------------------------------------------------
>
> Key: CASSANDRA-8404
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8404
> Project: Cassandra
> Issue Type: Bug
> Environment: I am using Cassandra 2.1.1 on 32-bit Ubuntu 12.04. I am
> running the program with -Xmx1000M.
> manish@manish[~]:> uname -a
> Linux manish 3.2.0-72-generic-pae #107-Ubuntu SMP Thu Nov 6 14:44:10 UTC 2014 i686 i686 i386 GNU/Linux
> Reporter: Manish
> Fix For: 2.1.4
>
> Attachments: Test1.java, cassandra.yaml
>
>
> I am able to create an SSTable for one file of 10M rows but not for another.
> The data file that works is subscribers1.gz and the data file that does not
> work is subscribers2.gz. Both files have the same values in the first column
> but different values in the second column. I wonder why CQLSSTableLoader does
> not work for a different set of data.
> The program expects unzipped txt files, so please unzip the files before
> running it. What I have observed is heavy GC activity once the program has
> processed around 5.2M lines of file subscribers2.gz. It is able to process up
> to 5.8M lines with very frequent full GC runs, but cannot get beyond 5.8M
> rows because no more memory is available.
> I have attached the Test1.java and cassandra.yaml I used for creating the
> SSTable. On the classpath I am specifying all jars from the lib folder of the
> extracted apache-cassandra-2.1.1-bin.tar.gz.
> Jira does not allow files larger than 10 MB, so I am sharing the data
> files on Google Drive.
> Link to download subscribers1.gz:
> https://drive.google.com/file/d/0B6_-ugKWlrfoOTRTa2FCNTFWU2c/view?usp=sharing
> Link to download subscribers2.gz:
> https://drive.google.com/file/d/0B6_-ugKWlrfocndycm9yM21rN0E/view?usp=sharing