[jira] [Commented] (CASSANDRA-13269) Snapshot support for custom secondary indices
[ https://issues.apache.org/jira/browse/CASSANDRA-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884575#comment-15884575 ]

Jeff Jirsa commented on CASSANDRA-13269:

Any of you other custom-secondary-index folks ([~jjordan] / [~iamaleksey], or [~adelapena]) eager to review?

> Snapshot support for custom secondary indices
> ---------------------------------------------
>
> Key: CASSANDRA-13269
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13269
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.12, 3.11.0
>
> Attachments: 0001-CASSANDRA-13269-custom-indices-snapshot.patch
>
>
> Enhance the index API to support snapshot of custom secondary indices.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13270) Add function hooks to deliver Elasticsearch as a Cassandra plugin
[ https://issues.apache.org/jira/browse/CASSANDRA-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13270:
--------------------------------------
    Attachment: 0001-CASSANDRA-13270-elasticsearch-as-a-plugin.patch

> Add function hooks to deliver Elasticsearch as a Cassandra plugin
> -----------------------------------------------------------------
>
> Key: CASSANDRA-13270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13270
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Minor
> Labels: features
> Fix For: 3.0.12, 3.11.0
>
> Attachments: 0001-CASSANDRA-13270-elasticsearch-as-a-plugin.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> With these basic modifications (see the patch) and the following tickets, the
> Elassandra project (see https://github.com/strapdata/elassandra) could be an
> Elasticsearch plugin for Cassandra.
> * CASSANDRA-12837 Add multi-threaded support to nodetool rebuild_index.
> * CASSANDRA-13267 Add CQL functions.
> * CASSANDRA-13268 Allow to create custom secondary index on static columns.
> * CASSANDRA-13269 Snapshot support for custom secondary indices
[jira] [Updated] (CASSANDRA-13270) Add function hooks to deliver Elasticsearch as a Cassandra plugin
[ https://issues.apache.org/jira/browse/CASSANDRA-13270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13270:
--------------------------------------
    Reproduced In: 3.0.11
    Status: Patch Available (was: Open)

See attached patch

> Add function hooks to deliver Elasticsearch as a Cassandra plugin
> -----------------------------------------------------------------
>
> Key: CASSANDRA-13270
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13270
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Minor
> Labels: features
> Fix For: 3.0.12, 3.11.0
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> With these basic modifications (see the patch) and the following tickets, the
> Elassandra project (see https://github.com/strapdata/elassandra) could be an
> Elasticsearch plugin for Cassandra.
> * CASSANDRA-12837 Add multi-threaded support to nodetool rebuild_index.
> * CASSANDRA-13267 Add CQL functions.
> * CASSANDRA-13268 Allow to create custom secondary index on static columns.
> * CASSANDRA-13269 Snapshot support for custom secondary indices
[jira] [Created] (CASSANDRA-13270) Add function hooks to deliver Elasticsearch as a Cassandra plugin
vincent royer created CASSANDRA-13270:
---------------------------------------

Summary: Add function hooks to deliver Elasticsearch as a Cassandra plugin
Key: CASSANDRA-13270
URL: https://issues.apache.org/jira/browse/CASSANDRA-13270
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: vincent royer
Priority: Minor
Fix For: 3.0.12, 3.11.0

With these basic modifications (see the patch) and the following tickets, the
Elassandra project (see https://github.com/strapdata/elassandra) could be an
Elasticsearch plugin for Cassandra.
* CASSANDRA-12837 Add multi-threaded support to nodetool rebuild_index.
* CASSANDRA-13267 Add CQL functions.
* CASSANDRA-13268 Allow to create custom secondary index on static columns.
* CASSANDRA-13269 Snapshot support for custom secondary indices
[jira] [Updated] (CASSANDRA-13269) Snapshot support for custom secondary indices
[ https://issues.apache.org/jira/browse/CASSANDRA-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13269:
--------------------------------------
    Attachment: 0001-CASSANDRA-13269-custom-indices-snapshot.patch

> Snapshot support for custom secondary indices
> ---------------------------------------------
>
> Key: CASSANDRA-13269
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13269
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.12, 3.11.0
>
> Attachments: 0001-CASSANDRA-13269-custom-indices-snapshot.patch
>
>
> Enhance the index API to support snapshot of custom secondary indices.
[jira] [Updated] (CASSANDRA-13269) Snapshot support for custom secondary indices
[ https://issues.apache.org/jira/browse/CASSANDRA-13269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13269:
--------------------------------------
    Labels: features (was: )
    Status: Patch Available (was: Open)

Here is an implementation that snapshots custom secondary indices when snapshotting SSTables.
With this feature, Elassandra is already able to make consistent snapshots of both SSTables and Elasticsearch Lucene files.

> Snapshot support for custom secondary indices
> ---------------------------------------------
>
> Key: CASSANDRA-13269
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13269
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.12, 3.11.0
>
>
> Enhance the index API to support snapshot of custom secondary indices.
[jira] [Created] (CASSANDRA-13269) Snapshot support for custom secondary indices
vincent royer created CASSANDRA-13269:
---------------------------------------

Summary: Snapshot support for custom secondary indices
Key: CASSANDRA-13269
URL: https://issues.apache.org/jira/browse/CASSANDRA-13269
Project: Cassandra
Issue Type: Improvement
Components: Core
Reporter: vincent royer
Priority: Trivial
Fix For: 3.0.12, 3.11.0

Enhance the index API to support snapshot of custom secondary indices.
[jira] [Updated] (CASSANDRA-13268) Allow to create custom secondary index on static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13268:
--------------------------------------
    Labels: features (was: )
    Reproduced In: 3.0.10
    Status: Patch Available (was: Open)

This patch allows creating a custom secondary index on a static column, with an option, as follows:

CREATE TABLE test.t2 (
    a int,
    b text,
    c text static,
    PRIMARY KEY (a, b)
);
CREATE CUSTOM INDEX my_idx ON test.t2 (c) USING 'a.class' WITH OPTIONS = {'enforce': 'true'};

> Allow to create custom secondary index on static columns
> --------------------------------------------------------
>
> Key: CASSANDRA-13268
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13268
> Project: Cassandra
> Issue Type: Improvement
> Components: Core, CQL
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.x
>
> Attachments: 0001-CASSANDRA-13268-custom-index-on-static-columns.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Custom secondary index implementations (like Elassandra) could benefit from
> indexing static columns, even if they are not searchable with CQL. Here is a
> proposal to allow index creation on static columns.
[jira] [Updated] (CASSANDRA-13268) Allow to create custom secondary index on static columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13268:
--------------------------------------
    Attachment: 0001-CASSANDRA-13268-custom-index-on-static-columns.patch

> Allow to create custom secondary index on static columns
> --------------------------------------------------------
>
> Key: CASSANDRA-13268
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13268
> Project: Cassandra
> Issue Type: Improvement
> Components: Core, CQL
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.x
>
> Attachments: 0001-CASSANDRA-13268-custom-index-on-static-columns.patch
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Custom secondary index implementations (like Elassandra) could benefit from
> indexing static columns, even if they are not searchable with CQL. Here is a
> proposal to allow index creation on static columns.
[jira] [Created] (CASSANDRA-13268) Allow to create custom secondary index on static columns
vincent royer created CASSANDRA-13268:
---------------------------------------

Summary: Allow to create custom secondary index on static columns
Key: CASSANDRA-13268
URL: https://issues.apache.org/jira/browse/CASSANDRA-13268
Project: Cassandra
Issue Type: Improvement
Components: Core, CQL
Reporter: vincent royer
Priority: Trivial
Fix For: 3.0.x

Custom secondary index implementations (like Elassandra) could benefit from
indexing static columns, even if they are not searchable with CQL. Here is a
proposal to allow index creation on static columns.
[jira] [Updated] (CASSANDRA-13267) Add new CQL functions
[ https://issues.apache.org/jira/browse/CASSANDRA-13267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13267:
--------------------------------------
    Attachment: 0001-CASSANDRA-13267-Add-CQL-functions.patch

> Add new CQL functions
> ---------------------
>
> Key: CASSANDRA-13267
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13267
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.x
>
> Attachments: 0001-CASSANDRA-13267-Add-CQL-functions.patch
>
>
> Introduce two new CQL functions:
> - toString(x) converts a column to its string representation.
> - toJsonArray(x, y, z...) generates a JSON array of JSON strings.
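To make the intended result of toJsonArray(x, y, z...) concrete, here is a minimal plain-Java sketch of "a JSON array of JSON strings". It is not taken from the attached patch: the class name JsonArraySketch, the helper quote, and the choice to render null as a JSON null are assumptions made for illustration only.

```java
import java.util.StringJoiner;

// Hypothetical illustration; not the code in 0001-CASSANDRA-13267-Add-CQL-functions.patch.
final class JsonArraySketch {

    /** Renders one value as a JSON string literal, escaping special characters. */
    static String quote(String value) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : value.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the \\uXXXX form.
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Converts each argument to its string form and joins them as a JSON array. */
    static String toJsonArray(Object... values) {
        StringJoiner joiner = new StringJoiner(",", "[", "]");
        for (Object v : values) {
            // Assumption: a null argument becomes a JSON null rather than the string "null".
            joiner.add(v == null ? "null" : quote(String.valueOf(v)));
        }
        return joiner.toString();
    }
}
```

For example, JsonArraySketch.toJsonArray("a", 1) yields ["a","1"]: every non-null value is first converted to a string, which matches the pairing with toString(x) in the description.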
[jira] [Updated] (CASSANDRA-13267) Add new CQL functions
[ https://issues.apache.org/jira/browse/CASSANDRA-13267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-13267:
--------------------------------------
    Labels: features (was: )
    Fix Version/s: 3.0.x (was: 3.11.x)
    Reproduced In: 3.0.11
    Status: Patch Available (was: Open)

> Add new CQL functions
> ---------------------
>
> Key: CASSANDRA-13267
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13267
> Project: Cassandra
> Issue Type: Improvement
> Components: CQL
> Reporter: vincent royer
> Priority: Trivial
> Labels: features
> Fix For: 3.0.x
>
>
> Introduce two new CQL functions:
> - toString(x) converts a column to its string representation.
> - toJsonArray(x, y, z...) generates a JSON array of JSON strings.
[jira] [Created] (CASSANDRA-13267) Add new CQL functions
vincent royer created CASSANDRA-13267:
---------------------------------------

Summary: Add new CQL functions
Key: CASSANDRA-13267
URL: https://issues.apache.org/jira/browse/CASSANDRA-13267
Project: Cassandra
Issue Type: Improvement
Components: CQL
Reporter: vincent royer
Priority: Trivial
Fix For: 3.11.x

Introduce two new CQL functions:
- toString(x) converts a column to its string representation.
- toJsonArray(x, y, z...) generates a JSON array of JSON strings.
[jira] [Updated] (CASSANDRA-12837) Add multi-threaded support to nodetool rebuild_index
[ https://issues.apache.org/jira/browse/CASSANDRA-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-12837:
--------------------------------------
    Attachment: 0001-CASSANDRA-12837-multi-threaded-rebuild_index.patch

Patch for Cassandra 3.0.11 (successfully tested on Cassandra 2.2 and 3.0.x)

> Add multi-threaded support to nodetool rebuild_index
> ----------------------------------------------------
>
> Key: CASSANDRA-12837
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12837
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Minor
> Labels: patch, performance
> Fix For: 3.0.12, 4.x
>
> Attachments: 0001-CASSANDRA-12837-multi-threaded-rebuild_index.patch,
> CASSANDRA-12837-2.2.9.txt
>
>
> Add multi-threaded support to nodetool rebuild_index to improve performance.
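The general idea behind a multi-threaded rebuild can be sketched with a fixed-size thread pool that processes independent units of work in parallel. This is only an illustration under stated assumptions, not the attached patch: ParallelRebuildSketch, rebuildAll, and the notion of a "unit" (for instance one SSTable to re-index) are hypothetical names, and no Cassandra API is used.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

// Hypothetical illustration of fanning index-rebuild work out over several threads.
final class ParallelRebuildSketch {

    /**
     * Runs rebuildOne for every unit on a pool of the given size and waits
     * until all of them have finished. Returns the number of units processed.
     */
    static <T> int rebuildAll(List<T> units, int threads, Consumer<T> rebuildOne) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (T unit : units) {
                pending.add(pool.submit(() -> rebuildOne.accept(unit)));
            }
            for (Future<?> f : pending) {
                f.get(); // Propagates the first failure instead of ignoring it.
            }
            return pending.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("rebuild interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("rebuild failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Waiting on every Future before returning keeps the call synchronous for the caller, the way a command-line tool would expect, while the work itself runs on up to `threads` workers.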
[jira] [Updated] (CASSANDRA-12837) Add multi-threaded support to nodetool rebuild_index
[ https://issues.apache.org/jira/browse/CASSANDRA-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-12837:
--------------------------------------
    Labels: patch performance (was: patch)

> Add multi-threaded support to nodetool rebuild_index
> ----------------------------------------------------
>
> Key: CASSANDRA-12837
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12837
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Minor
> Labels: patch, performance
> Fix For: 3.0.12, 4.x
>
> Attachments: CASSANDRA-12837-2.2.9.txt
>
>
> Add multi-threaded support to nodetool rebuild_index to improve performance.
[jira] [Updated] (CASSANDRA-12837) Add multi-threaded support to nodetool rebuild_index
[ https://issues.apache.org/jira/browse/CASSANDRA-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-12837:
--------------------------------------
    Reproduced In: 3.0.11, 2.2.x (was: 2.2.x)

> Add multi-threaded support to nodetool rebuild_index
> ----------------------------------------------------
>
> Key: CASSANDRA-12837
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12837
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Minor
> Labels: patch
> Fix For: 3.0.12, 4.x
>
> Attachments: CASSANDRA-12837-2.2.9.txt
>
>
> Add multi-threaded support to nodetool rebuild_index to improve performance.
[jira] [Updated] (CASSANDRA-12837) Add multi-threaded support to nodetool rebuild_index
[ https://issues.apache.org/jira/browse/CASSANDRA-12837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vincent royer updated CASSANDRA-12837:
--------------------------------------
    Fix Version/s: 3.0.12

> Add multi-threaded support to nodetool rebuild_index
> ----------------------------------------------------
>
> Key: CASSANDRA-12837
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12837
> Project: Cassandra
> Issue Type: Improvement
> Components: Core
> Reporter: vincent royer
> Priority: Minor
> Labels: patch
> Fix For: 3.0.12, 4.x
>
> Attachments: CASSANDRA-12837-2.2.9.txt
>
>
> Add multi-threaded support to nodetool rebuild_index to improve performance.
[jira] [Commented] (CASSANDRA-9989) Optimise BTree.Buider
[ https://issues.apache.org/jira/browse/CASSANDRA-9989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884366#comment-15884366 ]

Jay Zhuang commented on CASSANDRA-9989:
---------------------------------------

[~slebresne] Would you please review this [patch|https://github.com/cooldoger/cassandra/commit/bf6bc14a130dae64cb859e81ad54b21d5434d46a]?

> Optimise BTree.Buider
> ---------------------
>
> Key: CASSANDRA-9989
> URL: https://issues.apache.org/jira/browse/CASSANDRA-9989
> Project: Cassandra
> Issue Type: Sub-task
> Reporter: Benedict
> Assignee: Jay Zhuang
> Priority: Minor
> Fix For: 4.x
>
> Attachments: 9989-trunk.txt
>
>
> BTree.Builder could reduce its copying, and exploit toArray more efficiently,
> with some work. It's not very important right now because we don't make as
> much use of its bulk-add methods as we otherwise might, however over time
> this work will become more useful.
[jira] [Updated] (CASSANDRA-13266) Bulk loading sometimes is very slow?
[ https://issues.apache.org/jira/browse/CASSANDRA-13266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

liangsibin updated CASSANDRA-13266:
-----------------------------------
    Description:
When I bulk load SSTables created with CQLSSTableWriter, the load is sometimes very slow.

CQLSSTableWriter is used with withBufferSizeInMB set to 32 MB.

Two nodes write the SSTables and bulk load them:
1. Use CQLSSTableWriter to create the SSTables (60 threads).
2. When the directory is over 10 rows, bulk load the directory (20 threads).

The normal bulk load speed is about 70 MB/s per node, and bulk loading 141 GB of SSTables per node takes 90 minutes. Sometimes, however, it is very slow and the same data takes 4 hours. Why?

Here is the code that bulk loads the SSTables:
{code:java}
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import javax.management.JMX;
import javax.management.MBeanServerConnection;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;
import org.apache.cassandra.service.StorageServiceMBean;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
// Timer and Path are helper classes whose imports are not shown in the report.

public class JmxBulkLoader {
    static final Logger LOGGER = LoggerFactory.getLogger(JmxBulkLoader.class);
    private JMXConnector connector;
    private StorageServiceMBean storageBean;
    private Timer timer = new Timer();

    public JmxBulkLoader(String host, int port) throws Exception {
        connect(host, port);
    }

    // Opens a JMX connection to the node and obtains a proxy for its StorageService MBean.
    private void connect(String host, int port) throws IOException, MalformedObjectNameException {
        JMXServiceURL jmxUrl = new JMXServiceURL(
                String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));
        Map<String, Object> env = new HashMap<String, Object>();
        connector = JMXConnectorFactory.connect(jmxUrl, env);
        MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
        ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
        storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
    }

    public void close() throws IOException {
        connector.close();
    }

    // Asks the node to bulk load the SSTables under the given directory and logs the elapsed time.
    public void bulkLoad(String path) {
        LOGGER.info("begin load data to cassandra " + new Path(path).getName());
        timer.start();
        storageBean.bulkLoad(path);
        timer.end();
        LOGGER.info("bulk load took " + timer.getTimeTakenMillis() + "ms, path: " + new Path(path).getName());
    }
}
{code}
The bulk load thread:
{code:java}
public class BulkThread implements Runnable {
    private String path;
    private String jmxHost;
    private int jmxPort;

    public BulkThread(String path, String jmxHost, int jmxPort) {
        super();
        this.path = path;
        this.jmxHost = jmxHost;
        this.jmxPort = jmxPort;
    }

    @Override
    public void run() {
        JmxBulkLoader bulkLoader = null;
        try {
            bulkLoader = new JmxBulkLoader(jmxHost, jmxPort);
            bulkLoader.bulkLoad(path);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always release the JMX connection, even if the load failed.
            if (bulkLoader != null) {
                try {
                    bulkLoader.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
    }
}
{code}

  was:
When I bulk load SSTables created with CQLSSTableWriter, the load is sometimes very slow.

Two nodes write the SSTables and bulk load them:
1. Use CQLSSTableWriter to create the SSTables (60 threads).
2. When the directory is over 10 rows, bulk load the directory (20 threads).

The normal bulk load speed is about 70 MB/s per node, and bulk loading 141 GB of SSTables per node takes 90 minutes. Sometimes, however, it is very slow and the same data takes 4 hours. Why?

Here is the code that bulk loads the SSTables:
{code:java}
public class JmxBulkLoader {
    static final Logger LOGGER = LoggerFactory.getLogger(JmxBulkLoader.class);
    private JMXConnector connector;
    private StorageServiceMBean storageBean;
    private Timer timer = new Timer();

    public JmxBulkLoader(String host, int port) throws Exception {
        connect(host, port);
    }

    private void connect(String host, int port) throws IOException, MalformedObjectNameException {
        JMXServiceURL jmxUrl = new JMXServiceURL(
                String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));
        Map<String, Object> env = new HashMap<String, Object>();
        connector = JMXConnectorFactory.connect(jmxUrl, env);
        MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
        ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
        storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
    }

    public void
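As a quick check on the figures in this report, the average throughput implied by each run can be worked out directly. The sketch below assumes that "141G" means 141 GB with 1 GB = 1024 MB and that "70M/s" means 70 MB/s; the class and method names are illustrative and not part of the report.

```java
// Hypothetical helper for sanity-checking the reported bulk load numbers.
final class BulkLoadThroughput {

    /** Average throughput in MB/s for sizeGb gigabytes moved in the given number of minutes. */
    static double averageMbPerSecond(double sizeGb, double minutes) {
        return sizeGb * 1024.0 / (minutes * 60.0);
    }

    /** Minutes needed to move sizeGb gigabytes at a sustained rate of mbPerSecond. */
    static double minutesAt(double sizeGb, double mbPerSecond) {
        return sizeGb * 1024.0 / mbPerSecond / 60.0;
    }

    public static void main(String[] args) {
        // 141 GB in 90 minutes: about 26.7 MB/s on average.
        System.out.printf("normal run: %.1f MB/s%n", averageMbPerSecond(141, 90));
        // 141 GB in 4 hours: about 10.0 MB/s on average.
        System.out.printf("slow run:   %.1f MB/s%n", averageMbPerSecond(141, 240));
        // At a sustained 70 MB/s the same data would need about 34.4 minutes.
        System.out.printf("at 70 MB/s: %.1f minutes%n", minutesAt(141, 70));
    }
}
```

Under these assumptions even the "normal" 90-minute run averages well below 70 MB/s, so the 70 MB/s figure is presumably a peak or instantaneous rate rather than a sustained one; the report itself does not say which.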
[jira] [Commented] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884238#comment-15884238 ]

Robert Stupp commented on CASSANDRA-13259:
------------------------------------------

One minor nit: Can you add an entry to NEWS.txt about this change?

> Use platform specific X.509 default algorithm
> ---------------------------------------------
>
> Key: CASSANDRA-13259
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13259
> Project: Cassandra
> Issue Type: Improvement
> Components: Configuration
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Priority: Minor
> Fix For: 4.x
>
>
> We should replace the hardcoded "SunX509" default algorithm and use the JRE
> default instead. This implementation will currently not work on less popular
> platforms (e.g. IBM) and won't get any further updates.
> See also:
> https://bugs.openjdk.java.net/browse/JDK-8169745
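For readers unfamiliar with the JRE default mentioned in this ticket, the standard JSSE API exposes it through KeyManagerFactory.getDefaultAlgorithm() and TrustManagerFactory.getDefaultAlgorithm(). The sketch below shows that lookup in isolation; the wrapper class DefaultX509Algorithm and its method names are illustrative assumptions, not the change made in Cassandra.

```java
import java.security.NoSuchAlgorithmException;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.TrustManagerFactory;

// Hypothetical wrapper; only the javax.net.ssl calls are real JSSE API.
final class DefaultX509Algorithm {

    /** The key manager algorithm configured as default for the running JRE. */
    static String keyManagerAlgorithm() {
        return KeyManagerFactory.getDefaultAlgorithm();
    }

    /** The trust manager algorithm configured as default for the running JRE. */
    static String trustManagerAlgorithm() {
        return TrustManagerFactory.getDefaultAlgorithm();
    }

    /** Creates a KeyManagerFactory without naming a vendor-specific algorithm. */
    static KeyManagerFactory newKeyManagerFactory() {
        try {
            return KeyManagerFactory.getInstance(keyManagerAlgorithm());
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("JRE default algorithm is not available", e);
        }
    }
}
```

Because the algorithm name is resolved at runtime, the same code can work on a JRE whose provider does not register "SunX509".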
[jira] [Created] (CASSANDRA-13266) Bulk loading sometimes is very slow?
liangsibin created CASSANDRA-13266:
--------------------------------------

Summary: Bulk loading sometimes is very slow?
Key: CASSANDRA-13266
URL: https://issues.apache.org/jira/browse/CASSANDRA-13266
Project: Cassandra
Issue Type: Improvement
Reporter: liangsibin


When I bulk load SSTables created with CQLSSTableWriter, the load is sometimes very slow.

Setup: 2 nodes write SSTables and bulk load them.
1. Use CQLSSTableWriter to create the SSTables (60 threads).
2. When the directory has over 10 rows, bulk load the directory (20 threads).

The normal bulk load speed is about 70M/s per node, and bulk loading 141G of SSTables takes 90 minutes.
But sometimes it is very slow: the same data takes 4 hours.
Why?

Here is the code.

Bulk load SSTable:

public class JmxBulkLoader {
    static final Logger LOGGER = LoggerFactory.getLogger(JmxBulkLoader.class);
    private JMXConnector connector;
    private StorageServiceMBean storageBean;
    private Timer timer = new Timer();

    public JmxBulkLoader(String host, int port) throws Exception {
        connect(host, port);
    }

    // Open a JMX connection to the node and obtain a proxy for the StorageService MBean.
    private void connect(String host, int port) throws IOException, MalformedObjectNameException {
        JMXServiceURL jmxUrl = new JMXServiceURL(
                String.format("service:jmx:rmi:///jndi/rmi://%s:%d/jmxrmi", host, port));
        Map<String, Object> env = new HashMap<String, Object>();
        connector = JMXConnectorFactory.connect(jmxUrl, env);
        MBeanServerConnection mbeanServerConn = connector.getMBeanServerConnection();
        ObjectName name = new ObjectName("org.apache.cassandra.db:type=StorageService");
        storageBean = JMX.newMBeanProxy(mbeanServerConn, name, StorageServiceMBean.class);
    }

    public void close() throws IOException {
        connector.close();
    }

    // Bulk load one directory through JMX and log how long the call took.
    public void bulkLoad(String path) {
        LOGGER.info("begin load data to cassandra " + new Path(path).getName());
        timer.start();
        storageBean.bulkLoad(path);
        timer.end();
        LOGGER.info("bulk load took " + timer.getTimeTakenMillis() + "ms, path: " + new Path(path).getName());
    }
}

Bulk load thread:

public class BulkThread implements Runnable {
    private String path;
    private String jmxHost;
    private int jmxPort;

    public BulkThread(String path, String jmxHost, int jmxPort) {
        super();
        this.path = path;
        this.jmxHost = jmxHost;
        this.jmxPort = jmxPort;
    }

    @Override
    public void run() {
        JmxBulkLoader bulkLoader = null;
        try {
            bulkLoader = new JmxBulkLoader(jmxHost, jmxPort);
            bulkLoader.bulkLoad(path);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            // Always release the JMX connection, even if the load failed.
            if (bulkLoader != null)
                try {
                    bulkLoader.close();
                    bulkLoader = null;
                } catch (IOException e) {
                    e.printStackTrace();
                }
        }
    }
}
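To put the two durations in the report side by side, here is a small worked calculation of the average throughput they imply. It is only a sketch under stated assumptions: "141G" is read as 141 GiB moved in total, and the helper name is hypothetical. The report does not say whether the 141G is per node or across both nodes, nor how "70M/s" was measured, so these figures are not directly comparable with that number.

```java
// Hypothetical helper: average throughput implied by a data volume and a duration.
final class BulkLoadThroughput {

    /** Average throughput in MiB/s for `gib` gibibytes moved in `minutes` minutes. */
    static double mibPerSecond(double gib, double minutes) {
        return gib * 1024.0 / (minutes * 60.0);
    }

    public static void main(String[] args) {
        double normal = mibPerSecond(141, 90);   // the 90-minute run
        double slow = mibPerSecond(141, 240);    // the 4-hour run
        System.out.printf("normal run: %.1f MiB/s%n", normal);
        System.out.printf("slow run:   %.1f MiB/s%n", slow);
        System.out.printf("slowdown:   %.2fx%n", normal / slow);
    }
}
```

Under those assumptions the 90-minute run averages about 26.7 MiB/s and the 4-hour run about 10.0 MiB/s, a slowdown of roughly 2.7x.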
[jira] [Updated] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jason Brown updated CASSANDRA-13259:
------------------------------------
    Status: Ready to Commit  (was: Patch Available)
[jira] [Commented] (CASSANDRA-13259) Use platform specific X.509 default algorithm
[ https://issues.apache.org/jira/browse/CASSANDRA-13259?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884237#comment-15884237 ]

Jason Brown commented on CASSANDRA-13259:
-----------------------------------------

+1
[jira] [Commented] (CASSANDRA-12653) In-flight shadow round requests
[ https://issues.apache.org/jira/browse/CASSANDRA-12653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15884235#comment-15884235 ]

Jason Brown commented on CASSANDRA-12653:
-----------------------------------------

[~spod] I'm on board with {{firstSynSendAt}} being a timestamp, I'm just questioning what you compare it against. Instead of this:

{code}
long ts = epStateMap.values().iterator().next().getUpdateTimestamp();
if ((ts - Gossiper.instance.firstSynSendAt) < 0 || Gossiper.instance.firstSynSendAt == 0)
{code}

I'm suggesting this:

{code}
if ((System.nanoTime() - Gossiper.instance.firstSynSendAt) < 0 || Gossiper.instance.firstSynSendAt == 0)
{code}

It amounts to the same thing (a check against {{System.nanoTime()}}), but it's more obvious where the value comes from.

[~jkni] wrt {{synchronized}}, on one hand I agree with you, but then that makes the argument for making many more of the methods on {{Gossiper}} synchronized. If we're going to make this method different from the rest (which is fine with me), can we add a relevant, detailed comment explaining why?

> In-flight shadow round requests
> -------------------------------
>
> Key: CASSANDRA-12653
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12653
> Project: Cassandra
> Issue Type: Bug
> Components: Distributed Metadata
> Reporter: Stefan Podkowinski
> Assignee: Stefan Podkowinski
> Priority: Minor
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> Bootstrapping or replacing a node in the cluster requires gathering and checking
> some host IDs or tokens by doing a gossip "shadow round" once before joining
> the cluster. This is done by sending a gossip SYN to all seeds until we
> receive a response with the cluster state, from where we can move on in the
> bootstrap process. Receiving a response marks the shadow round as done and
> calls {{Gossiper.resetEndpointStateMap}} to clean up the received state
> again.
> The issue here is that at this point there might be other in-flight requests
> and it's very likely that shadow round responses from other seeds will be
> received afterwards, while the current state of the bootstrap process doesn't
> expect this to happen (e.g. gossiper may or may not be enabled).
> One side effect will be that MigrationTasks are spawned for each shadow round
> reply except the first. Tasks might or might not execute based on whether at
> execution time {{Gossiper.resetEndpointStateMap}} had been called, which
> affects the outcome of {{FailureDetector.instance.isAlive(endpoint)}} at
> start of the task. You'll see error log messages such as the following when this
> happens:
> {noformat}
> INFO [SharedPool-Worker-1] 2016-09-08 08:36:39,255 Gossiper.java:993 -
> InetAddress /xx.xx.xx.xx is now UP
> ERROR [MigrationStage:1] 2016-09-08 08:36:39,255 FailureDetector.java:223
> - unknown endpoint /xx.xx.xx.xx
> {noformat}
> Although it isn't pretty, I currently don't see any serious harm from this,
> but it would be good to get a second opinion (feel free to close as "won't
> fix").
> /cc [~Stefania] [~thobbs]
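As a side note on the comparison discussed in the comment above: both variants use the subtraction form {{(a - b) < 0}} rather than {{a < b}}, which is the form the {{System.nanoTime()}} Javadoc recommends because the counter may overflow. The sketch below illustrates that idiom in isolation. The class and method names are hypothetical and it does not reproduce the actual {{Gossiper}} code; it only mirrors the shape of the condition quoted in the comment.

```java
// Hypothetical illustration of the overflow-safe comparison for System.nanoTime() values.
final class NanoTimeComparison {

    /** True if nanoTime value `a` was taken before `b`, even if the counter wrapped in between. */
    static boolean isBefore(long a, long b) {
        return a - b < 0;
    }

    /**
     * Same shape as the condition in the comment: `now` is earlier than the
     * recorded first SYN send time, or no send time has been recorded yet (0).
     */
    static boolean matchesCondition(long now, long firstSynSendAt) {
        return isBefore(now, firstSynSendAt) || firstSynSendAt == 0;
    }

    public static void main(String[] args) {
        // Near the top of the long range, adding 10 ns wraps to a negative value.
        long a = Long.MAX_VALUE - 5;
        long b = a + 10;
        System.out.println("naive a < b:     " + (a < b));
        System.out.println("isBefore(a, b):  " + isBefore(a, b));

        long sentAt = System.nanoTime();
        System.out.println("condition holds: " + matchesCondition(System.nanoTime(), sentAt));
    }
}
```

The naive comparison gives the wrong answer once the value wraps, while the subtraction form stays correct as long as the two readings are less than about 292 years apart.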