[ 
https://issues.apache.org/jira/browse/CASSANDRA-7271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14006440#comment-14006440
 ] 

Michael Shuler commented on CASSANDRA-7271:
-------------------------------------------

For a purely functional example using the tools I know, cqlsh has no problem 
loading data from a CSV - I'm using trunk, at the moment, but this works the 
same on all versions.
{noformat}
mshuler@hana:~/tmp/7271/BulkLoadFiles$ ls
7271.java  cassandra.yaml  cql.txt  my.test.cql  test.csv
mshuler@hana:~/tmp/7271/BulkLoadFiles$ cat test.csv ; echo
1,2
3,4
mshuler@hana:~/tmp/7271/BulkLoadFiles$ cat my.test.cql 
CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '1'};

CREATE TABLE test.test (
    col1 int PRIMARY KEY,
    col2 int
);

COPY test.test (col1 , col2) FROM 'test.csv';

SELECT * from test.test;
mshuler@hana:~/tmp/7271/BulkLoadFiles$ echo "SOURCE 'my.test.cql';" | cqlsh
2 rows imported in 0.007 seconds.

 col1 | col2
------+------
    1 |    2
    3 |    4

(2 rows)
{noformat}

So that said, I'm not familiar with gradle, read through 7218 a bit, and the 
java snippet isn't quite enough for me to go on, personally, but I'll see if I 
can get some help with that from the devs.

> Bulk data loading in Cassandra causing OOM
> ------------------------------------------
>
>                 Key: CASSANDRA-7271
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-7271
>             Project: Cassandra
>          Issue Type: Bug
>            Reporter: Prasanth Gullapalli
>            Assignee: Michael Shuler
>         Attachments: BulkLoadFiles.zip
>
>
> I am trying to load data from a csv file into Cassandra table using 
> SSTableSimpleUnsortedWriter. As the latest maven cassandra dependencies have 
> some issues with it, I have taken the _next_ beta (rc) version cut as 
> suggested in CASSANDRA-7218. But after taking it, I am facing issues with 
> bulk data loading
> Here is the piece of code which loads data:
> {code:java}
> public void loadData(TableDefinition tableDefinition, InputStream 
> csvInputStream){
>     createDataInDBFormat(tableDefinition, csvInputStream);
>     Path dbFilePath = Paths.get(TEMP_DIR, keyspace, 
> tableDefinition.getName());
>     //BulkLoader.main(new 
> String[]{"-d","localhost",dbFilePath.toUri().getPath()});
>     try {
>         JMXServiceURL jmxUrl = new JMXServiceURL(String.format(
>                 "service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", cassandraHost, 
> cassandraJMXPort));
>         JMXConnector connector = JMXConnectorFactory.connect(jmxUrl, new 
> HashMap<String, Object>());
>         MBeanServerConnection mbeanServerConn = 
> connector.getMBeanServerConnection();
>         ObjectName name = new 
> ObjectName("org.apache.cassandra.db:type=StorageService");
>         StorageServiceMBean storageBean = JMX.newMBeanProxy(mbeanServerConn, 
> name, StorageServiceMBean.class);
>         storageBean.bulkLoad(dbFilePath.toUri().getPath());
>         connector.close();
>     } catch (IOException | MalformedObjectNameException e) {
>         e.printStacktrace()
>     }
>     FileUtils.deleteQuietly(dbFilePath.toFile());
> }
> private void createDataInDBFormat(TableDefinition tableDefinition, 
> InputStream csvInputStream) {
>     try(Reader reader = new InputStreamReader(csvInputStream)){
>         String tableName = tableDefinition.getName();
>         File directory = Paths.get(TEMP_DIR, keyspace, tableName).toFile();
>         directory.mkdirs();
>         String yamlPath = 
> "file:\\"+CASSANDRA_HOME+File.separator+"conf"+File.separator+"cassandra.yaml";
>         System.setProperty("cassandra.config", yamlPath);
>         SSTableSimpleUnsortedWriter writer = new SSTableSimpleUnsortedWriter(
>                 directory, new Murmur3Partitioner(), keyspace,
>                 tableName, AsciiType.instance, null, 10);
>         long timestamp = System.currentTimeMillis() * 1000;
>         CSVReader csvReader = new CSVReader(reader);
>         String[] colValues = null;
>         List<ColumnDefinition> columnDefinitions = 
> tableDefinition.getColumnDefinitions();
>         while((colValues = csvReader.readNext()) != null){
>             if(colValues.length != 0){
>                 writer.newRow(bytes(colValues[0]));
>                 for(int index = 1; index< colValues.length; index++){
>                     ColumnDefinition columnDefinition = 
> columnDefinitions.get(index);       
>                     writer.addColumn(bytes(columnDefinition.getName()), 
> bytes(colValues[index]), timestamp);
>                 }
>             }
>         }
>         csvReader.close();
>         writer.close();
>     } catch (IOException e) {
>         e.printStacktrace();
>     }
> }
> {code}
> On trying to run loadData, it is giving me the following exception:
> {code:xml}
> 11:23:18.035 [45742123@qtp-1703018180-0] ERROR 
> com.adaequare.common.config.TransactionPerRequestFilter.doInTransactionWithoutResult
>  39 - Problem in executing request : 
> [http://localhost:8081/mapro-engine/rest/masterdata/pumpData]. Cause 
> :org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: 
> Java heap space
> javax.servlet.ServletException: 
> org.glassfish.jersey.server.ContainerException: java.lang.OutOfMemoryError: 
> Java heap space
>         at 
> org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:392) 
> ~[jersey-container-servlet-core-2.6.jar:na]
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:381)
>  ~[jersey-container-servlet-core-2.6.jar:na]
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:344)
>  ~[jersey-container-servlet-core-2.6.jar:na]
>         at 
> org.glassfish.jersey.servlet.ServletContainer.service(ServletContainer.java:219)
>  ~[jersey-container-servlet-core-2.6.jar:na]
>         at 
> org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511) 
> ~[jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1166)
>  [jetty-6.1.25.jar:6.1.25]
>         at 
> com.adaequare.common.config.TransactionPerRequestFilter$1.doInTransactionWithoutResult(TransactionPerRequestFilter.java:37)
>  ~[mapro-commons-1.0.jar:na]
>         at 
> org.springframework.transaction.support.TransactionCallbackWithoutResult.doInTransaction(TransactionCallbackWithoutResult.java:34)
>  [spring-tx-4.0.3.RELEASE.jar:4.0.3.RELEASE]
> :4.0.3.RELEASE]
>         at 
> org.springframework.transaction.support.TransactionTemplate.execute(TransactionTemplate.java:133)
>  [spring-tx-4.0.3.RELEASE.jar:4.0.3.RELEASE]
>         at 
> com.adaequare.common.config.TransactionPerRequestFilter.doFilter(TransactionPerRequestFilter.java:33)
>  [mapro-commons-1.0.jar:na]
>         at 
> org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1157)
>  [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:388) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:765) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:440) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
>  [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
>  [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) 
> [jetty-6.1.25.jar:6.1.25]
>         at org.mortbay.jetty.Server.handle(Server.java:326) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:542) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:943)
>  [jetty-6.1.25.jar:6.1.25]
>         at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:756) 
> [jetty-6.1.25.jar:6.1.25]
>         at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:218) 
> [jetty-6.1.25.jar:6.1.25]
>         at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:404) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:410) 
> [jetty-6.1.25.jar:6.1.25]
>         at 
> org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582) 
> [jetty-util-6.1.25.jar:6.1.25]
> Caused by: org.glassfish.jersey.server.ContainerException: 
> java.lang.OutOfMemoryError: Java heap space
>         at 
> org.glassfish.jersey.servlet.internal.ResponseWriter.rethrow(ResponseWriter.java:249)
>  ~[jersey-container-servlet-core-2.6.jar:na]
>         at 
> org.glassfish.jersey.servlet.internal.ResponseWriter.failure(ResponseWriter.java:231)
>  ~[jersey-container-servlet-core-2.6.jar:na]
>         at 
> org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:433)
>  ~[jersey-server-2.6.jar:na]
>         at 
> org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:265) 
> ~[jersey-server-2.6.jar:na]
>         at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) 
> ~[jersey-common-2.6.jar:na]
>         at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) 
> ~[jersey-common-2.6.jar:na]
>         at org.glassfish.jersey.internal.Errors.process(Errors.java:315) 
> ~[jersey-common-2.6.jar:na]
>         at org.glassfish.jersey.internal.Errors.process(Errors.java:297) 
> ~[jersey-common-2.6.jar:na]
>         at org.glassfish.jersey.internal.Errors.process(Errors.java:267) 
> ~[jersey-common-2.6.jar:na]
>         at 
> org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:319)
>  ~[jersey-common-2.6.jar:na]
>         at 
> org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:236) 
> ~[jersey-server-2.6.jar:na]
>         at 
> org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1028)
>  ~[jersey-server-2.6.jar:na]
>         at 
> org.glassfish.jersey.servlet.WebComponent.service(WebComponent.java:373) 
> ~[jersey-container-servlet-core-2.6.jar:na]
>         ... 26 common frames omitted
> Caused by: java.lang.OutOfMemoryError: Java heap space
> {code}
> Here are my GRADLE_OPTS: -Xms1024m -Xmx1024m -XX:MaxPermSize=256m -Xdebug 
> -Xrunjdwp:transport=dt_socket,address=9999,server=y,suspend=n
> And my csv file hardly contains 2 lines of data. Not sure what is causing OOM 
> here? Is there some problem with the latest code I have taken or am I missing 
> something here?



--
This message was sent by Atlassian JIRA
(v6.2#6252)

Reply via email to