[jira] [Commented] (CASSANDRA-3124) java heap limit for nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097034#comment-13097034 ]

Zenek Kraweznik commented on CASSANDRA-3124:
--------------------------------------------

Oh, and one important thing: I haven't changed any default Java limit in the Java config; I've modified only cassandra-env.sh.

java heap limit for nodetool
----------------------------

                 Key: CASSANDRA-3124
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3124
             Project: Cassandra
          Issue Type: Improvement
          Components: Core, Tools
    Affects Versions: 0.8.1, 0.8.2, 0.8.3, 0.8.4
         Environment: not important
            Reporter: Zenek Kraweznik
            Priority: Minor

By default (from the Debian package):

# nodetool
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
#

and:

--- /usr/bin/nodetool.old	2011-09-02 14:15:14.228152799 +0200
+++ /usr/bin/nodetool	2011-09-02 14:14:28.745154552 +0200
@@ -55,7 +55,7 @@
     ;;
 esac
 
-$JAVA -cp $CLASSPATH -Dstorage-config=$CASSANDRA_CONF \
+$JAVA -Xmx32m -cp $CLASSPATH -Dstorage-config=$CASSANDRA_CONF \
     -Dlog4j.configuration=log4j-tools.properties \
     org.apache.cassandra.tools.NodeCmd $@

After every upgrade I had to add the limit manually. I think it's a good idea to add it by default ;)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3133) nodetool netstats doesn't show streams during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097036#comment-13097036 ]

Zenek Kraweznik commented on CASSANDRA-3133:
--------------------------------------------

When I type "nodetool -h $IP netstats" or "nodetool -h $IP streams" I see only:

Nothing streaming to /10.117.199.232

I also think the decommission of the node will not end.

# nodetool -h 10.10.10.11 ring
Address      DC           Rack   Status  State    Load       Owns     Token
10.10.10.11  datacenter1  rack1  Up      Normal   193.5 GB   25.00%   0
10.10.10.12  datacenter1  rack1  Up      Normal   252.07 GB  33.33%   56713727820156410577229101238628035242
10.10.10.13  datacenter1  rack1  Up      Normal   188.63 GB  33.33%   113427455640312821154458202477256070485
10.10.10.14  datacenter1  rack1  Up      Leaving  141.97 GB  8.33%    127605887595351923798765477786913079296

The 4th host has been in the Leaving state for 72 hours.

# nodetool -h 10.10.10.14 decommission ; echo END

The nodetool process is still running; END has not been printed yet. 8 GB of RAM is free (25%; the servers have 32 GB), the CPU is not loaded, and there is a lot of free disk space. Network traffic is about 25 Kbit/s, but the links are 1 Gbps.

nodetool netstats doesn't show streams during decommission
----------------------------------------------------------

                 Key: CASSANDRA-3133
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3133
             Project: Cassandra
          Issue Type: Bug
          Components: Tools
    Affects Versions: 0.8.4
         Environment: debian 6.0.2.1 (squeeze), java 1.6.26 (Sun, non-free packages).
            Reporter: Zenek Kraweznik

nodetool netstats is not showing transferred files during decommission.
[jira] [Created] (CASSANDRA-3136) Allow CFIF to keep going despite unavailable ranges
Allow CFIF to keep going despite unavailable ranges
---------------------------------------------------

                 Key: CASSANDRA-3136
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3136
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop
            Reporter: Mck SembWever
            Priority: Minor

From http://thread.gmane.org/gmane.comp.db.cassandra.user/18902

use-case-1:
We use Cassandra as storage for web pages; we store the HTML, all URLs that have the same HTML data, and some computed data. We run Hadoop MR jobs to compute lexical and thematic data for each page and to export the data to binary files for later use. A URL gets into Cassandra on a user request (a pageview), so if we delete a URL, it comes back quickly if the page is active. Because of that, and because there is a lot of data, we have the keyspace set to RF=1. We can drop the whole keyspace and it will regenerate quickly and contain only fresh data, so we don't care about losing a node.

use-case-2:
Trying to extract a small random sample (like a Pig SAMPLE) of data out of Cassandra.

use-case-3:
Searching for something or some pattern where one hit is enough. If you get the hit, it's a positive result regardless of whether ranges were ignored; if you don't, and you *know* a range was ignored along the way, you can re-run the job later. For example, such a job could be run at regular intervals during the day until a hit was found.
[jira] [Updated] (CASSANDRA-3136) Allow CFIF to keep going despite unavailable ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-3136:
-------------------------------------

    Description:

From http://thread.gmane.org/gmane.comp.db.cassandra.user/18902

use-case-1 (from Patrik Modesto):
We use Cassandra as storage for web pages; we store the HTML, all URLs that have the same HTML data, and some computed data. We run Hadoop MR jobs to compute lexical and thematic data for each page and to export the data to binary files for later use. A URL gets into Cassandra on a user request (a pageview), so if we delete a URL, it comes back quickly if the page is active. Because of that, and because there is a lot of data, we have the keyspace set to RF=1. We can drop the whole keyspace and it will regenerate quickly and contain only fresh data, so we don't care about losing a node.

use-case-2:
Trying to extract a small random sample (like a Pig SAMPLE) of data out of Cassandra.

use-case-3:
Searching for something or some pattern where one hit is enough. If you get the hit, it's a positive result regardless of whether ranges were ignored; if you don't, and you *know* a range was ignored along the way, you can re-run the job later. For example, such a job could be run at regular intervals during the day until a hit was found.

    was:

From http://thread.gmane.org/gmane.comp.db.cassandra.user/18902

use-case-1:
We use Cassandra as storage for web pages; we store the HTML, all URLs that have the same HTML data, and some computed data. We run Hadoop MR jobs to compute lexical and thematic data for each page and to export the data to binary files for later use. A URL gets into Cassandra on a user request (a pageview), so if we delete a URL, it comes back quickly if the page is active. Because of that, and because there is a lot of data, we have the keyspace set to RF=1. We can drop the whole keyspace and it will regenerate quickly and contain only fresh data, so we don't care about losing a node.

use-case-2:
Trying to extract a small random sample (like a Pig SAMPLE) of data out of Cassandra.

use-case-3:
Searching for something or some pattern where one hit is enough. If you get the hit, it's a positive result regardless of whether ranges were ignored; if you don't, and you *know* a range was ignored along the way, you can re-run the job later. For example, such a job could be run at regular intervals during the day until a hit was found.

Allow CFIF to keep going despite unavailable ranges
---------------------------------------------------

                 Key: CASSANDRA-3136
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3136
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop
            Reporter: Mck SembWever
            Priority: Minor
[jira] [Commented] (CASSANDRA-3031) Add 4 byte integer type
[ https://issues.apache.org/jira/browse/CASSANDRA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097045#comment-13097045 ]

Eric Evans commented on CASSANDRA-3031:
---------------------------------------

{quote}
I think that you are worried too much about backward CQL compatibility. Cassandra clusters are operated by responsible persons, and it does not break existing clusters. Fixing schema-creating CQL scripts is trivial (if they exist).
{quote}

It's not an idle concern, it's a reaction.

{quote}
Let's say we change CQL int from int8 to int4. It will create a new cluster with an unexpected schema, but the application will get an exception on the first insert; you can't validate an int8 in an int4 field. The admin can fix the schema via cassandra-cli or fix the CQL script. It's a fail-fast scenario.
{quote}

... and update any code that makes assumptions about the length returned, etc, etc.

Cassandra's next release will be the coveted 1.0. What that means precisely is up for debate, but I think everyone is in agreement that it communicates "We're All Grown Up". For me, that means we're past the point where we can realistically suggest that people smoke test, then pick up the broken pieces.

Add 4 byte integer type
-----------------------

                 Key: CASSANDRA-3031
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3031
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.8.4
         Environment: any
            Reporter: Radim Kolar
            Priority: Minor
              Labels: hector, lhf
             Fix For: 1.0
         Attachments: apache-cassandra-0.8.4-SNAPSHOT.jar, src.diff, test.diff

Cassandra currently lacks support for a 4-byte fixed-size integer data type. The Java API Hector and the C libcassandra like to serialize integers as 4 bytes in network order. The problem is that you can't use cassandra-cli to manipulate the stored rows. Compatibility with other applications using an API that follows the Cassandra integer encoding standard is problematic too. Because adding a new datatype/validator is fairly simple, I recommend adding an int4 data type.

Compatibility with Hector is important because it is the most used Java Cassandra API and a lot of applications are using it. This problem was discussed several times already:
http://comments.gmane.org/gmane.comp.db.hector.user/2125
https://issues.apache.org/jira/browse/CASSANDRA-2585

It would be nice to have compatibility with cassandra-cli and other applications without rewriting Hector apps.
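For illustration, a minimal sketch of what "serialize integers as 4 bytes in network order" means. This is not Hector's actual code; it just shows the encoding in question. Java's ByteBuffer is big-endian by default, which is network byte order.

```java
import java.nio.ByteBuffer;

public class Int4Encoding {
    // Encode an int as exactly 4 bytes in network (big-endian) byte order.
    static byte[] encode(int value) {
        return ByteBuffer.allocate(4).putInt(value).array();
    }

    // Decode 4 big-endian bytes back into an int.
    static int decode(byte[] bytes) {
        return ByteBuffer.wrap(bytes).getInt();
    }

    public static void main(String[] args) {
        byte[] b = encode(42);
        assert b.length == 4;
        assert b[0] == 0 && b[3] == 42; // most significant byte first
        assert decode(b) == 42;
        System.out.println("ok");
    }
}
```

A validator for an int4 type would reject any value whose byte length is not exactly 4, which is what makes the 8-byte encoding used elsewhere incompatible.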
[jira] [Issue Comment Edited] (CASSANDRA-3031) Add 4 byte integer type
[ https://issues.apache.org/jira/browse/CASSANDRA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097045#comment-13097045 ]

Eric Evans edited comment on CASSANDRA-3031 at 9/5/11 8:17 AM:
---------------------------------------------------------------

{quote}
I think that you are worried too much about backward CQL compatibility. Cassandra clusters are operated by responsible persons, and it does not break existing clusters. Fixing schema-creating CQL scripts is trivial (if they exist).
{quote}

It's not an idle concern, it's a reaction.

{quote}
Let's say we change CQL int from int8 to int4. It will create a new cluster with an unexpected schema, but the application will get an exception on the first insert; you can't validate an int8 in an int4 field. The admin can fix the schema via cassandra-cli or fix the CQL script. It's a fail-fast scenario.
{quote}

Cassandra's next release will be the coveted 1.0. What that means precisely is up for debate, but I think everyone is in agreement that it communicates "We're All Grown Up". For me, that means we're past the point where we can realistically suggest that people smoke test, then pick up the broken pieces.

was (Author: urandom):

{quote}
I think that you are worried too much about backward CQL compatibility. Cassandra clusters are operated by responsible persons, and it does not break existing clusters. Fixing schema-creating CQL scripts is trivial (if they exist).
{quote}

It's not an idle concern, it's a reaction.

{quote}
Let's say we change CQL int from int8 to int4. It will create a new cluster with an unexpected schema, but the application will get an exception on the first insert; you can't validate an int8 in an int4 field. The admin can fix the schema via cassandra-cli or fix the CQL script. It's a fail-fast scenario.
{quote}

... and update any code that makes assumptions about the length returned, etc, etc.

Cassandra's next release will be the coveted 1.0. What that means precisely is up for debate, but I think everyone is in agreement that it communicates "We're All Grown Up". For me, that means we're past the point where we can realistically suggest that people smoke test, then pick up the broken pieces.

Add 4 byte integer type
-----------------------

                 Key: CASSANDRA-3031
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3031
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.8.4
         Environment: any
            Reporter: Radim Kolar
            Priority: Minor
              Labels: hector, lhf
             Fix For: 1.0
         Attachments: apache-cassandra-0.8.4-SNAPSHOT.jar, src.diff, test.diff

Cassandra currently lacks support for a 4-byte fixed-size integer data type. The Java API Hector and the C libcassandra like to serialize integers as 4 bytes in network order. The problem is that you can't use cassandra-cli to manipulate the stored rows. Compatibility with other applications using an API that follows the Cassandra integer encoding standard is problematic too. Because adding a new datatype/validator is fairly simple, I recommend adding an int4 data type.

Compatibility with Hector is important because it is the most used Java Cassandra API and a lot of applications are using it. This problem was discussed several times already:
http://comments.gmane.org/gmane.comp.db.hector.user/2125
https://issues.apache.org/jira/browse/CASSANDRA-2585

It would be nice to have compatibility with cassandra-cli and other applications without rewriting Hector apps.
[jira] [Commented] (CASSANDRA-3122) SSTableSimpleUnsortedWriter take long time when inserting big rows
[ https://issues.apache.org/jira/browse/CASSANDRA-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097066#comment-13097066 ]

Sylvain Lebresne commented on CASSANDRA-3122:
---------------------------------------------

bq. every time newRow is called, serializedSize iterates through all the columns to compute the size

Yes, and I agree this isn't the most efficient thing ever, though I would be kind of surprised if this were a bottleneck. Anyway, I don't oppose improving this, but we should create a new ticket for that.

bq. An improvement in bulk loading would be to use a single-threaded ColumnFamily for bulk loading.

Yes, but we'll do it in 1.0 only, because we have CASSANDRA-2843 there, which basically makes this trivial, while it is uglier to do without it.

SSTableSimpleUnsortedWriter take long time when inserting big rows
-------------------------------------------------------------------

                 Key: CASSANDRA-3122
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3122
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.8.3
            Reporter: Benoit Perroud
            Assignee: Sylvain Lebresne
            Priority: Minor
             Fix For: 0.8.5
         Attachments: 3122.patch, SSTableSimpleUnsortedWriter-v2.patch, SSTableSimpleUnsortedWriter.patch

In SSTableSimpleUnsortedWriter, when dealing with rows having a lot of columns, if we call newRow several times (to flush data as soon as possible), the time taken by the newRow() call increases non-linearly. This is because when newRow is called, we merge the ever-growing existing CF with the new one.
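The non-linear growth described above can be shown with a toy model. This is illustrative only, not Cassandra code: it just counts how much work a merge-into-accumulated-row strategy performs when each newRow() call touches every column already accumulated.

```java
import java.util.ArrayList;
import java.util.List;

public class MergeCostSketch {
    // Simulate merging `batches` batches of `batchSize` columns into one
    // growing row, counting the columns each merge has to touch.
    static long totalMergeWork(int batches, int batchSize) {
        List<Integer> accumulated = new ArrayList<>();
        long work = 0;
        for (int i = 0; i < batches; i++) {
            List<Integer> fresh = new ArrayList<>();
            for (int j = 0; j < batchSize; j++)
                fresh.add(i * batchSize + j);
            // the merge walks everything accumulated so far plus the new batch
            work += accumulated.size() + fresh.size();
            accumulated.addAll(fresh);
        }
        return work;
    }

    public static void main(String[] args) {
        long w10 = totalMergeWork(10, 100);
        long w100 = totalMergeWork(100, 100);
        // 10x more batches costs ~92x more merge work: quadratic, not linear
        System.out.println(w10 + " vs " + w100); // prints "5500 vs 505000"
    }
}
```

Total work is batchSize * batches * (batches + 1) / 2, i.e. O(batches^2), which matches the report that newRow() gets slower the more times it is called on a big row.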
[jira] [Commented] (CASSANDRA-3118) nodetool can not decommission a node
[ https://issues.apache.org/jira/browse/CASSANDRA-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097065#comment-13097065 ]

Zenek Kraweznik commented on CASSANDRA-3118:
--------------------------------------------

I also can't decommission a node in 0.8.4. Communication between nodes is fine; CPU and RAM utilization is OK (I have a lot of free resources). My nodes are numbered 1 to 4, and I want to disable node 4 (10.10.10.14). Node 4 is still in the Leaving state, but all transfers seem to be finished.

nodetool can not decommission a node
------------------------------------

                 Key: CASSANDRA-3118
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3118
             Project: Cassandra
          Issue Type: Bug
          Components: Tools
    Affects Versions: 0.8.4
         Environment: Cassandra 0.8.4
            Reporter: deng
             Fix For: 0.8.5
         Attachments: 3118-debug.txt

When I use nodetool ring I get the result below, and then I want to decommission the 100.86.17.90 node, but I get the error:

[root@ip bin]# ./nodetool -h 10.86.12.225 ring
Address        DC           Rack   Status  State    Load       Owns     Token
                                                                        154562542458917734942660802527609328132
100.86.17.90   datacenter1  rack1  Up      Leaving  1.08 MB    11.21%   3493450320433654773610109291263389161
100.86.12.225  datacenter1  rack1  Up      Normal   558.25 MB  14.25%   27742979166206700793970535921354744095
100.86.12.224  datacenter1  rack1  Up      Normal   5.01 GB    6.58%    38945137636148605752956920077679425910

ERROR:
[root@ip bin]# ./nodetool -h 100.86.17.90 decommission
Exception in thread "main" java.lang.UnsupportedOperationException
	at java.util.AbstractList.remove(AbstractList.java:144)
	at java.util.AbstractList$Itr.remove(AbstractList.java:360)
	at java.util.AbstractCollection.removeAll(AbstractCollection.java:337)
	at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1041)
	at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1006)
	at org.apache.cassandra.service.StorageService.handleStateLeaving(StorageService.java:877)
	at org.apache.cassandra.service.StorageService.onChange(StorageService.java:732)
	at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:839)
	at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:986)
	at org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1836)
	at org.apache.cassandra.service.StorageService.decommission(StorageService.java:1855)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93)
	at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27)
	at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208)
	at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120)
	at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262)
	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836)
	at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761)
	at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1426)
	at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72)
	at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1264)
	at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1359)
	at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:305)
	at sun.rmi.transport.Transport$1.run(Transport.java:159)
	at java.security.AccessController.doPrivileged(Native Method)
	at
svn commit: r1165218 - in /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable: SSTableSimpleWriterTest.java SSTableWriterTest.java
Author: slebresne
Date: Mon Sep  5 09:23:54 2011
New Revision: 1165218

URL: http://svn.apache.org/viewvc?rev=1165218&view=rev
Log:
Add missing file and fix type from CASSANDRA-3122 commit

Added:
    cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableSimpleWriterTest.java
Modified:
    cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java

Added: cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableSimpleWriterTest.java
URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableSimpleWriterTest.java?rev=1165218&view=auto
==============================================================================
--- cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableSimpleWriterTest.java (added)
+++ cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableSimpleWriterTest.java Mon Sep  5 09:23:54 2011
@@ -0,0 +1,104 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.cassandra.io.sstable;
+
+import java.io.File;
+
+import org.junit.Test;
+
+import org.apache.cassandra.CleanupHelper;
+import org.apache.cassandra.Util;
+import org.apache.cassandra.db.*;
+import org.apache.cassandra.db.marshal.IntegerType;
+import static org.apache.cassandra.utils.ByteBufferUtil.bytes;
+import static org.apache.cassandra.utils.ByteBufferUtil.toInt;
+
+public class SSTableSimpleWriterTest extends CleanupHelper
+{
+    @Test
+    public void testSSTableSimpleUnsortedWriter() throws Exception
+    {
+        final int INC = 5;
+        final int NBCOL = 10;
+
+        String tablename = "Keyspace1";
+        String cfname = "StandardInteger1";
+
+        Table t = Table.open(tablename); // make sure we create the directory
+        File dir = new File(t.getDataFileLocation(0));
+        assert dir.exists();
+
+        SSTableSimpleUnsortedWriter writer = new SSTableSimpleUnsortedWriter(dir, tablename, cfname, IntegerType.instance, null, 16);
+
+        int k = 0;
+
+        // Adding a few rows first
+        for (; k < 10; ++k)
+        {
+            writer.newRow(bytes("Key" + k));
+            writer.addColumn(bytes(1), bytes("v"), 0);
+            writer.addColumn(bytes(2), bytes("v"), 0);
+            writer.addColumn(bytes(3), bytes("v"), 0);
+        }
+
+        // Testing multiple opening of the same row
+        // We'll write column 0, 5, 10, .., on the first row, then 1, 6, 11, ... on the second one, etc.
+        for (int i = 0; i < INC; ++i)
+        {
+            writer.newRow(bytes("Key" + k));
+            for (int j = 0; j < NBCOL; ++j)
+            {
+                writer.addColumn(bytes(i + INC * j), bytes("v"), 1);
+            }
+        }
+        k++;
+
+        // Adding a few more rows
+        for (; k < 20; ++k)
+        {
+            writer.newRow(bytes("Key" + k));
+            writer.addColumn(bytes(1), bytes("v"), 0);
+            writer.addColumn(bytes(2), bytes("v"), 0);
+            writer.addColumn(bytes(3), bytes("v"), 0);
+        }
+
+        writer.close();
+
+        // Now add that newly created files to the column family
+        ColumnFamilyStore cfs = t.getColumnFamilyStore(cfname);
+        cfs.loadNewSSTables();
+
+        // Check we get expected results
+        ColumnFamily cf = Util.getColumnFamily(t, Util.dk("Key10"), cfname);
+        assert cf.getColumnCount() == INC * NBCOL : "expecting " + (INC * NBCOL) + " columns, got " + cf.getColumnCount();
+        int i = 0;
+        for (IColumn c : cf)
+        {
+            assert toInt(c.name()) == i : "Column name should be " + i + ", got " + toInt(c.name());
+            assert c.value().equals(bytes("v"));
+            assert c.timestamp() == 1;
+            ++i;
+        }
+
+        cf = Util.getColumnFamily(t, Util.dk("Key19"), cfname);
+        assert cf.getColumnCount() == 3 : "expecting 3 columns, got " + cf.getColumnCount();
+    }
+}

Modified: cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java
URL:
[jira] [Commented] (CASSANDRA-3122) SSTableSimpleUnsortedWriter take long time when inserting big rows
[ https://issues.apache.org/jira/browse/CASSANDRA-3122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097112#comment-13097112 ]

Hudson commented on CASSANDRA-3122:
-----------------------------------

Integrated in Cassandra-0.8 #313 (See [https://builds.apache.org/job/Cassandra-0.8/313/])
    Add missing file and fix type from CASSANDRA-3122 commit

slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1165218
Files :
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableSimpleWriterTest.java
* /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/io/sstable/SSTableWriterTest.java

SSTableSimpleUnsortedWriter take long time when inserting big rows
-------------------------------------------------------------------

                 Key: CASSANDRA-3122
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3122
             Project: Cassandra
          Issue Type: Improvement
          Components: Core
    Affects Versions: 0.8.3
            Reporter: Benoit Perroud
            Assignee: Sylvain Lebresne
            Priority: Minor
             Fix For: 0.8.5
         Attachments: 3122.patch, SSTableSimpleUnsortedWriter-v2.patch, SSTableSimpleUnsortedWriter.patch

In SSTableSimpleUnsortedWriter, when dealing with rows having a lot of columns, if we call newRow several times (to flush data as soon as possible), the time taken by the newRow() call increases non-linearly. This is because when newRow is called, we merge the ever-growing existing CF with the new one.
[jira] [Created] (CASSANDRA-3137) Implement wrapping intersections for ConfigHelper's InputKeyRange
Implement wrapping intersections for ConfigHelper's InputKeyRange
-----------------------------------------------------------------

                 Key: CASSANDRA-3137
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3137
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop
    Affects Versions: 0.8.4
            Reporter: Mck SembWever
            Assignee: Mck SembWever

Previously there was no support for multiple intersections between the split's range and the job's configured range. After CASSANDRA-3108 it is now possible.
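For context, a minimal sketch of what a "wrapping" token range is and why intersecting one can yield multiple pieces. This is illustrative only; the names and the use of a plain long token are simplifications, not Cassandra's actual Range API.

```java
public class WrappingRangeSketch {
    // A range (left, right] on the token ring; left >= right means the range
    // wraps past the minimum token and covers two arcs of the ring.
    static boolean wraps(long left, long right) {
        return left >= right;
    }

    // Does token t fall within (left, right] on the ring?
    static boolean contains(long left, long right, long t) {
        if (wraps(left, right))
            return t > left || t <= right; // either of the two arcs
        return t > left && t <= right;
    }

    public static void main(String[] args) {
        // Non-wrapping range (10, 20]
        assert contains(10, 20, 15);
        assert !contains(10, 20, 25);
        // Wrapping range (90, 5] covers both the end and the start of the
        // ring, which is why intersecting it with a split's range can
        // produce more than one disjoint piece.
        assert contains(90, 5, 95);
        assert contains(90, 5, 3);
        assert !contains(90, 5, 50);
        System.out.println("ok");
    }
}
```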
[jira] [Commented] (CASSANDRA-3108) Make Range and Bounds objects client-safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097126#comment-13097126 ]

Mck SembWever commented on CASSANDRA-3108:
------------------------------------------

Didn't see it until now, but your patch, Jonathan, removes the limitation that ConfigHelper's InputKeyRange cannot wrap. I've entered CASSANDRA-3137 to allow wrapping intersections in {{ColumnFamilyInputFormat}}.

Make Range and Bounds objects client-safe
-----------------------------------------

                 Key: CASSANDRA-3108
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3108
             Project: Cassandra
          Issue Type: Bug
    Affects Versions: 0.8.2
            Reporter: Jonathan Ellis
            Assignee: Mck SembWever
              Labels: hadoop
             Fix For: 0.8.5
         Attachments: 3108.txt

From Mck's comment on CASSANDRA-1125:

Something broke here in production once we went out with 0.8.2. It may have been some poor testing; I'm not entirely sure, and a little surprised. CFIF:135 breaks because inside dhtRange.intersects(jobRange) there's a call to new Range(token, token), which calls StorageService.getPartitioner(), and StorageService is null since we're not inside the server. A quick fix is to change Range:148 from new Range(token, token) to new Range(token, token, partitioner), making the presumption that the partitioner for the new Range is the same as this Range's.
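The failure mode described in CASSANDRA-3108, a constructor that reaches into a server-side singleton and so breaks when run inside a Hadoop client, can be sketched in miniature. The names here are illustrative stand-ins, not Cassandra's actual classes.

```java
public class ClientSafeRangeSketch {
    interface Partitioner {}

    // Stand-in for a server-only singleton like StorageService:
    // it is null when the code runs inside a Hadoop client JVM.
    static Partitioner serverPartitioner = null;

    final long left, right;
    final Partitioner partitioner;

    // Server-style constructor: blows up client-side, the way
    // new Range(token, token) did inside intersects().
    ClientSafeRangeSketch(long left, long right) {
        this(left, right, requireServerPartitioner());
    }

    // Client-safe constructor: the caller supplies the partitioner explicitly.
    ClientSafeRangeSketch(long left, long right, Partitioner p) {
        this.left = left;
        this.right = right;
        this.partitioner = p;
    }

    static Partitioner requireServerPartitioner() {
        if (serverPartitioner == null)
            throw new IllegalStateException("not running inside the server");
        return serverPartitioner;
    }

    public static void main(String[] args) {
        boolean failed = false;
        try {
            new ClientSafeRangeSketch(0, 1); // client-side: the singleton is null
        } catch (IllegalStateException e) {
            failed = true;
        }
        assert failed;

        // Passing the partitioner in keeps the object usable outside the server.
        Partitioner p = new Partitioner() {};
        assert new ClientSafeRangeSketch(0, 1, p).partitioner == p;
        System.out.println("ok");
    }
}
```

Threading the dependency through the constructor instead of resolving it from a global is what makes the object "client-safe".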
[jira] [Updated] (CASSANDRA-3137) Implement wrapping intersections for ConfigHelper's InputKeyRange
[ https://issues.apache.org/jira/browse/CASSANDRA-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-3137:
-------------------------------------

    Affects Version/s:     (was: 0.8.4)
                           0.8.5

Implement wrapping intersections for ConfigHelper's InputKeyRange
-----------------------------------------------------------------

                 Key: CASSANDRA-3137
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3137
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop
    Affects Versions: 0.8.5
            Reporter: Mck SembWever
            Assignee: Mck SembWever

Previously there was no support for multiple intersections between the split's range and the job's configured range. After CASSANDRA-3108 it is now possible.
[jira] [Updated] (CASSANDRA-3137) Implement wrapping intersections for ConfigHelper's InputKeyRange
[ https://issues.apache.org/jira/browse/CASSANDRA-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mck SembWever updated CASSANDRA-3137:
-------------------------------------

    Attachment: CASSANDRA-3137.patch

I haven't tested this (with real data) yet, but the code looks pretty simple and straightforward...

Implement wrapping intersections for ConfigHelper's InputKeyRange
-----------------------------------------------------------------

                 Key: CASSANDRA-3137
                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3137
             Project: Cassandra
          Issue Type: Improvement
          Components: Hadoop
    Affects Versions: 0.8.5
            Reporter: Mck SembWever
            Assignee: Mck SembWever
         Attachments: CASSANDRA-3137.patch

Previously there was no support for multiple intersections between the split's range and the job's configured range. After CASSANDRA-3108 it is now possible.
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097159#comment-13097159 ] Sylvain Lebresne commented on CASSANDRA-2474: - I do agree with Eric earlier on, I think this issue could stand being summarized, I'm not too sure I understand what is proposed here so far. So I apologize in advance if it turns out the propositions made above do answer everything that is below. However, it seems that we're focusing on some representation based on materialized views here. Did we focus on that because we consider the basic use cases for composite type, those where we don't use them for materialized view at all, are easy to deal with ? Why not consider composite column name for what they are, *one* column name that is composed of multiple sub-elements ? What I mean here is, I'm not that sure I'm convinced that bq. the original idea from CASSANDRA-2025 of SELECT columnA:x, columnA:y FROM foo WHERE key = 'bar' is the wrong way to go I'm even less convinced when I see the number of comments on this ticket. Again, there seems that the focus was exclusively on materialized views, but I strongly think that composite column names are useful for more than materialized view (I've used composite column names countless time, never for materialized view). But let's take an example of what I mean. Suppose that what you store in your column family are events. Those events arrive with a timestamp whose resolution is maybe the minute (or more precisely, you only care about query them at that precision). Those events have a category (that may have a sorting that make sense), and maybe a subcategory. They also have a unique identifier eventId. Moreover there is a lot of events every minutes and the category/subcategory are not necessarily predefined. The query you want to do are typically: * Give me all the events for time t, category c and sub-category sc. 
* Give me all the events for time t and category c. * Give me all the events for time t and category c1 to c2 (where c1 < c2 for the category sorting). * Give me everything for the last 4 hours. Probably most of those would require paging because there are shit tons of events, but still, I want to do those fast. I haven't found a better data model for that kind of example than using a composite column name where the name is (timestamp, category, sub-category, eventId). I haven't found in all the discussion above anything that would allow me to do this better than what is in the initial proposition of CASSANDRA-2025. Now I completely agree that having a good notation to work with materialized views would be great, but IMO if we try to find a syntax that is too far from how composite columns work, I fear we'll end up limiting the usefulness of composite types in CQL to one narrow use case. I'll note too that I haven't seen any proposal of how insertion with compound types should look. CQL support for compound columns Key: CASSANDRA-2474 URL: https://issues.apache.org/jira/browse/CASSANDRA-2474 Project: Cassandra Issue Type: Sub-task Components: API, Core Reporter: Eric Evans Assignee: Pavel Yaskevich Labels: cql Fix For: 1.0 Attachments: screenshot-1.jpg, screenshot-2.jpg For the most part, this boils down to supporting the specification of compound column names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to create structures from the results. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
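Sylvain's event example above leans entirely on component-wise ordering of a composite name. As a hedged illustration (plain Java with invented names, not Cassandra code), a comparator over (timestamp, category, subcategory, eventId) tuples makes each of the listed queries a contiguous slice of one sorted row:

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Sketch only: a composite column name as *one* name made of sub-elements,
// compared component by component. All field names are illustrative.
public class CompositeNameSketch
{
    record Name(long minute, String category, String subcategory, long eventId)
        implements Comparable<Name>
    {
        public int compareTo(Name o)
        {
            int c = Long.compare(minute, o.minute);
            if (c != 0) return c;
            c = category.compareTo(o.category);
            if (c != 0) return c;
            c = subcategory.compareTo(o.subcategory);
            return c != 0 ? c : Long.compare(eventId, o.eventId);
        }
    }

    public static void main(String[] args)
    {
        NavigableMap<Name, String> row = new TreeMap<>();
        row.put(new Name(10, "net", "tcp", 1), "e1");
        row.put(new Name(10, "net", "udp", 2), "e2");
        row.put(new Name(10, "disk", "io", 3), "e3");
        // "all events for time t=10 and category net": one contiguous slice.
        NavigableMap<Name, String> slice =
            row.subMap(new Name(10, "net", "", Long.MIN_VALUE), true,
                       new Name(10, "net", "\uffff", Long.MAX_VALUE), true);
        System.out.println(slice.values()); // only the two "net" events
    }
}
```

The range query "category c1 to c2" is just a wider `subMap`, which is why the eventId has to be the last component: it is the tie-breaker, not a filter.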
svn commit: r1165306 - in /cassandra/trunk: ./ src/java/org/apache/cassandra/db/ src/java/org/apache/cassandra/db/context/ src/java/org/apache/cassandra/io/sstable/ src/java/org/apache/cassandra/strea
Author: slebresne
Date: Mon Sep 5 14:55:28 2011
New Revision: 1165306

URL: http://svn.apache.org/viewvc?rev=1165306&view=rev
Log:
Handle large rows with single-pass streaming

patch by yukim; reviewed by slebresne for CASSANDRA-3003

Modified:
    cassandra/trunk/CHANGES.txt
    cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java
    cassandra/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java
    cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
    cassandra/trunk/src/java/org/apache/cassandra/streaming/IncomingStreamReader.java

Modified: cassandra/trunk/CHANGES.txt
URL: http://svn.apache.org/viewvc/cassandra/trunk/CHANGES.txt?rev=1165306&r1=1165305&r2=1165306&view=diff
==============================================================================
--- cassandra/trunk/CHANGES.txt (original)
+++ cassandra/trunk/CHANGES.txt Mon Sep 5 14:55:28 2011
@@ -12,7 +12,7 @@
  * don't bother persisting columns shadowed by a row tombstone (CASSANDRA-2589)
  * reset CF and SC deletion times after gc_grace (CASSANDRA-2317)
  * optimize away seek when compacting wide rows (CASSANDRA-2879)
- * single-pass streaming (CASSANDRA-2677)
+ * single-pass streaming (CASSANDRA-2677, 3003)
  * use reference counting for deleting sstables instead of relying on GC (CASSANDRA-2521)
  * store hints as serialized mutations instead of pointers to data row

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java?rev=1165306&r1=1165305&r2=1165306&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java Mon Sep 5 14:55:28 2011
@@ -21,7 +21,6 @@ package org.apache.cassandra.db;
 import java.io.IOException;
 import java.nio.ByteBuffer;
 import java.security.MessageDigest;
-import java.util.Map;
 
 import org.apache.log4j.Logger;
 
@@ -70,7 +69,9 @@ public class CounterColumn extends Colum
     public static CounterColumn create(ByteBuffer name, ByteBuffer value, long timestamp, long timestampOfLastDelete, boolean fromRemote)
     {
-        if (fromRemote)
+        // #elt being negative means we have to clean delta
+        short count = value.getShort(value.position());
+        if (fromRemote || count < 0)
             value = CounterContext.instance().clearAllDelta(value);
         return new CounterColumn(name, value, timestamp, timestampOfLastDelete);
     }
@@ -285,4 +286,8 @@ public class CounterColumn extends Colum
         }
     }
 
+    public IColumn markDeltaToBeCleared()
+    {
+        return new CounterColumn(name, contextManager.markDeltaToBeCleared(value), timestamp, timestampOfLastDelete);
+    }
 }

Modified: cassandra/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java?rev=1165306&r1=1165305&r2=1165306&view=diff
==============================================================================
--- cassandra/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java (original)
+++ cassandra/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java Mon Sep 5 14:55:28 2011
@@ -130,7 +130,7 @@ public class CounterContext implements I
     private static int headerLength(ByteBuffer context)
     {
-        return HEADER_SIZE_LENGTH + context.getShort(context.position()) * HEADER_ELT_LENGTH;
+        return HEADER_SIZE_LENGTH + Math.abs(context.getShort(context.position())) * HEADER_ELT_LENGTH;
     }
 
     private static int compareId(ByteBuffer bb1, int pos1, ByteBuffer bb2, int pos2)
@@ -442,6 +442,28 @@ public class CounterContext implements I
     }
 
     /**
+     * Mark context to delete delta afterward.
+     * Marking is done by multiplying #elt by -1 to preserve header length
+     * and #elt count in order to clear all delta later.
+     *
+     * @param context a counter context
+     * @return context that marked to delete delta
+     */
+    public ByteBuffer markDeltaToBeCleared(ByteBuffer context)
+    {
+        int headerLength = headerLength(context);
+        if (headerLength == 0)
+            return context;
+
+        ByteBuffer marked = context.duplicate();
+        short count = context.getShort(context.position());
+        // negate #elt to mark as deleted, without changing its size.
+        if (count > 0)
+            marked.putShort(marked.position(), (short) (count * -1));
+        return marked;
+    }
+
+    /**
      * Remove all the delta of a context (i.e, set an empty header).
      *
      * @param context a counter context

Modified: cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java
URL:
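The marking trick in this commit can be seen in isolation: negating the leading #elt short flags the context while taking the absolute value keeps the computed header length correct. A minimal standalone sketch (toy buffer layout; the HEADER_ELT_LENGTH value is assumed here, and note that duplicate() shares the underlying bytes with the original buffer):

```java
import java.nio.ByteBuffer;

// Toy model of the CounterContext header trick from this commit: the first
// short (#elt) counts header elements; negating it marks "delta to be
// cleared" without changing the header's byte length, so headerLength()
// stays correct by taking Math.abs(#elt). Constants are assumed values.
public class DeltaMarkSketch
{
    static final int HEADER_SIZE_LENGTH = 2; // one short for #elt
    static final int HEADER_ELT_LENGTH = 2;  // assumed element width (sketch only)

    static int headerLength(ByteBuffer context)
    {
        return HEADER_SIZE_LENGTH + Math.abs(context.getShort(context.position())) * HEADER_ELT_LENGTH;
    }

    static ByteBuffer markDeltaToBeCleared(ByteBuffer context)
    {
        // duplicate() shares content with 'context'; the real code has the
        // same property, so the mark is visible through both buffers.
        ByteBuffer marked = context.duplicate();
        short count = context.getShort(context.position());
        if (count > 0) // only a positive #elt still needs marking
            marked.putShort(marked.position(), (short) (count * -1));
        return marked;
    }

    public static void main(String[] args)
    {
        ByteBuffer ctx = ByteBuffer.allocate(8);
        ctx.putShort(0, (short) 2); // two header elements
        int before = headerLength(ctx);
        ByteBuffer marked = markDeltaToBeCleared(ctx);
        // sign flipped, computed header length unchanged
        System.out.println(marked.getShort(marked.position()) + " " + (headerLength(marked) == before));
    }
}
```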
[jira] [Resolved] (CASSANDRA-3003) Trunk single-pass streaming doesn't handle large row correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne resolved CASSANDRA-3003. - Resolution: Fixed Trunk single-pass streaming doesn't handle large row correctly -- Key: CASSANDRA-3003 URL: https://issues.apache.org/jira/browse/CASSANDRA-3003 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 1.0 Reporter: Sylvain Lebresne Assignee: Yuki Morishita Priority: Critical Labels: streaming Fix For: 1.0 Attachments: 3003-v1.txt, 3003-v2.txt, 3003-v3.txt, 3003-v5.txt, v3003-v4.txt For normal column families, trunk streaming always buffers the whole row into memory. It uses {noformat} ColumnFamily.serializer().deserializeColumns(in, cf, true, true); {noformat} on the input bytes. We must avoid this for rows that don't fit in the inMemoryLimit. Note that for regular column families, for a given row, there is actually no need to even recreate the bloom filter or column index, nor to deserialize the columns. It is enough to read the key and row size to feed the index writer, and then simply dump the rest on disk directly. This would make streaming more efficient, avoid a lot of object creation and avoid the pitfall of big rows. Counter column families are unfortunately trickier, because each column needs to be deserialized (to mark it as 'fromRemote'). However, we don't need to do the double pass of LazilyCompactedRow for that. We can simply use an SSTableIdentityIterator and deserialize/reserialize input as it comes.
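The "deserialize/reserialize input as it comes" idea for counter rows can be sketched as a constant-memory stream transform. The record layout and the flag byte below are invented for illustration; the real patch rewrites serialized CounterColumns via SSTableIdentityIterator:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hedged sketch of single-pass rewriting: read one length-prefixed column at
// a time, rewrite it (here a toy 'fromRemote' flag byte is forced on), and
// emit it immediately, so memory use is per-column, never per-row.
// record layout (invented): [1 flag byte][int length][length value bytes]
public class SinglePassSketch
{
    public static void rewrite(DataInput in, DataOutput out, int columns) throws IOException
    {
        for (int i = 0; i < columns; i++)
        {
            in.readByte();                 // discard incoming flag
            int len = in.readInt();
            byte[] value = new byte[len];  // one column's worth, not one row's
            in.readFully(value);
            out.writeByte(1);              // mark as coming from a remote node
            out.writeInt(len);
            out.write(value);
        }
    }
}
```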
[jira] [Commented] (CASSANDRA-3003) Trunk single-pass streaming doesn't handle large row correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097165#comment-13097165 ] Sylvain Lebresne commented on CASSANDRA-3003: - lgtm, +1. Committed with a tiny change to use a cheaper array-backed column family in appendToStream, since we deserialize in order (and in a single thread).
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097166#comment-13097166 ] Pavel Yaskevich commented on CASSANDRA-2474: If we consider that timestamp is the key and event_id, category and subcategory are the composite name, then: bq. Give me all the events for time t, category c and sub-category sc {noformat} SELECT name AS (event_id, category, subcategory), value AS event FROM events WHERE key = timestamp AND category = name AND subcategory = name; {noformat} bq. Give me all the events for time t and category c {noformat} SELECT name AS (event_id, category, *), value AS event FROM events WHERE key = timestamp AND category = name; {noformat} bq. Give me all the events for time t and category c1 to c2 (where c1 < c2 for the category sorting) {noformat} SELECT name AS (event_id, category, *), value AS event FROM events WHERE key = timestamp AND category > c1 AND category < c2; {noformat} bq. Give me everything for the last 4 hours {noformat} SELECT name AS (event_id, category, *), value AS event FROM events WHERE key > timestamp AND key < timestamp; {noformat}
[jira] [Commented] (CASSANDRA-3003) Trunk single-pass streaming doesn't handle large row correctly
[ https://issues.apache.org/jira/browse/CASSANDRA-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097174#comment-13097174 ] Hudson commented on CASSANDRA-3003: --- Integrated in Cassandra #1074 (See [https://builds.apache.org/job/Cassandra/1074/]) Handle large rows with single-pass streaming patch by yukim; reviewed by slebresne for CASSANDRA-3003 slebresne : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1165306 Files : * /cassandra/trunk/CHANGES.txt * /cassandra/trunk/src/java/org/apache/cassandra/db/CounterColumn.java * /cassandra/trunk/src/java/org/apache/cassandra/db/context/CounterContext.java * /cassandra/trunk/src/java/org/apache/cassandra/io/sstable/SSTableWriter.java * /cassandra/trunk/src/java/org/apache/cassandra/streaming/IncomingStreamReader.java
[jira] [Commented] (CASSANDRA-3091) Move the caching of KS and CF metadata in the JDBC suite from Connection to Statement
[ https://issues.apache.org/jira/browse/CASSANDRA-3091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097178#comment-13097178 ] Rick Shaw commented on CASSANDRA-3091: -- Let's close this in deference to CASSANDRA-2734, and get that into trunk. Move the caching of KS and CF metadata in the JDBC suite from Connection to Statement - Key: CASSANDRA-3091 URL: https://issues.apache.org/jira/browse/CASSANDRA-3091 Project: Cassandra Issue Type: Improvement Components: Drivers Affects Versions: 0.8.4 Reporter: Rick Shaw Assignee: Rick Shaw Priority: Minor Labels: JDBC Fix For: 0.8.6 Attachments: move-metadata-for-decoder-to-statement-level-v1.txt, move-metadata-for-decoder-to-statement-level-v2.txt Currently, all caching of metadata used in JDBC's {{ColumnDecoder}} class is loaded and held in the {{CassandraConnection}} class. The implication is that any schema activity on the connected server after the connection is established is not reflected in the KSs and CFs that can be accessed by the {{ResultSet}}, {{Statement}} and {{PreparedStatement}}. By moving the cached metadata to the {{Statement}} level, the currency of the metadata can be checked within the {{Statement}} and reloaded if it is seen to be absent. And by instantiating a new {{Statement}} (on any existing connection) you are assured of getting the most current copy of the metadata known to the server at the time of instantiation.
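The proposed move can be sketched as a lazily populated per-Statement cache, where a miss triggers a fresh fetch from the server. This is illustrative only; the types and loader below are placeholders, not the actual JDBC suite API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Sketch of statement-scoped metadata caching: each Statement owns its own
// cache, loaded lazily, so a new Statement on an existing connection always
// observes the schema as of its own first use. Names are invented.
public class StatementScopedMetadata
{
    private final Map<String, String> cfMetadata = new HashMap<>();
    private final Function<String, String> serverLoader; // placeholder schema fetch

    public StatementScopedMetadata(Function<String, String> serverLoader)
    {
        this.serverLoader = serverLoader;
    }

    public String metadataFor(String columnFamily)
    {
        // "reloaded if it is seen to be absent": a miss fetches current metadata
        return cfMetadata.computeIfAbsent(columnFamily, serverLoader);
    }
}
```

The design point is that cache lifetime equals Statement lifetime: staleness is bounded by how long a Statement lives, instead of by how long the Connection lives.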
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097184#comment-13097184 ] Sylvain Lebresne commented on CASSANDRA-2474: - Well, the timestamp was not meant to be the key in my example, and the event_id needs to be the last component for this to make sense (since it is not specified in the query), but ok. Now, I don't understand how: {noformat} SELECT name AS (category, subcategory, *), value AS event FROM events WHERE key = timestamp AND category = category AND subcategory = subcat; {noformat} is fundamentally different from {noformat} SELECT category:subcat:event_id, value FROM events WHERE key = timestamp; {noformat} which is roughly the proposition from CASSANDRA-2025. And I mean fundamentally different, not just from a syntax point of view (I have nothing against using parentheses). If it is just a syntax difference, then fine. Or how {noformat} SELECT name AS (category, *), value AS event FROM events WHERE key = timestamp AND category > c1 AND category < c2; {noformat} is fundamentally different from {noformat} SELECT c1:*..c2:*, value FROM events WHERE key = timestamp; {noformat} Maybe giving an example of what is supposed to be returned would start to show the differences, but so far it seems only a difference of syntax. And the discussions above suggest that there is more than that underneath.
[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097189#comment-13097189 ] paul cannon commented on CASSANDRA-2434: bq. Ok, so if we always prefer to bootstrap from the correct token, then I still think we should combine getRangesWithStrictSource and getRangesWithSources. Basically the logic should be, find the 'best' node to stream from. If the user requested it, also find a list of other candidates and order them by proximity. Right? I don't think so. I would still want to leave the option to stream from the closest even if the strict best node is available. node bootstrapping can violate consistency -- Key: CASSANDRA-2434 URL: https://issues.apache.org/jira/browse/CASSANDRA-2434 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: paul cannon Fix For: 1.1 Attachments: 2434.patch.txt My reading (a while ago) of the code indicates that there is no logic involved during bootstrapping that avoids consistency level violations. If I recall correctly it just grabs neighbors that are currently up. There are at least two issues I have with this behavior: * If I have a cluster where I have applications relying on QUORUM with RF=3, and bootstrapping completes based on only one node, I have just violated the supposedly guaranteed consistency semantics of the cluster. * Nodes can flap up and down at any time, so even if a human takes care to look at which nodes are up and thinks about it carefully before bootstrapping, there's no guarantee. A complication is that it not only depends on use-case whether this is an issue (if all you ever do you do at CL.ONE, it's fine); even in a cluster which is otherwise used for QUORUM operations you may wish to accept less-than-quorum nodes during bootstrap in various emergency situations.
A potential easy fix is to have bootstrap take an argument which is the number of hosts to bootstrap from, or to assume QUORUM if none is given. (A related concern is bootstrapping across data centers. You may *want* to bootstrap to a local node and then do a repair to avoid sending loads of data across DC:s while still achieving consistency. Or even if you don't care about the consistency issues, I don't think there is currently a way to bootstrap from local nodes only.) Thoughts?
[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097190#comment-13097190 ] paul cannon commented on CASSANDRA-2434: bq. I'm not sure I understand, are you saying that B would violate this, or just that the status quo does? I'm saying B would violate this, yes. B was bootstrap from the right token, but if that one isn't up, bootstrap from any other token preferring the closer ones, right? I'm saying we can't just automatically choose another token if the user didn't specifically say it's ok.
[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097204#comment-13097204 ] Nick Bailey commented on CASSANDRA-2434: Paul, The suggestion was that if the 'correct' node is down, you can force the bootstrap to complete anyway (probably from the closest node, but that is transparent to the user), but only if the 'correct' node is down. It sounds like you agree with Jonathan on the more general approach though. Zhu, Repair doesn't help in the case when you lost data due to a node going down. Also, if only one node is down you should still be able to read/write at quorum and achieve consistency (assuming your replication factor is greater than 2).
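The selection policy being debated in the comments above (prefer the strict source that owns the range; consult proximity-ordered fallbacks only when the operator explicitly opted in; otherwise refuse rather than silently risk consistency) might be sketched like this. All names are invented, not the actual getRangesWithStrictSource/getRangesWithSources API:

```java
import java.util.List;
import java.util.Optional;

// Sketch of bootstrap-source selection under the policy discussed here:
// strict source first, closest live candidate only on explicit opt-in,
// otherwise fail fast instead of violating consistency expectations.
public class BootstrapSourcePolicy
{
    public static Optional<String> pickSource(Optional<String> strictSource,
                                              List<String> liveByProximity,
                                              boolean operatorAllowsNonStrict)
    {
        if (strictSource.isPresent())
            return strictSource;                        // the 'correct' token holder is up
        if (operatorAllowsNonStrict && !liveByProximity.isEmpty())
            return Optional.of(liveByProximity.get(0)); // closest live fallback
        return Optional.empty();                        // refuse to bootstrap
    }
}
```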
[jira] [Updated] (CASSANDRA-3128) Replace compression and compression_options config parameters by just a compression_options map.
[ https://issues.apache.org/jira/browse/CASSANDRA-3128?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sylvain Lebresne updated CASSANDRA-3128: Attachment: 0002-Use-only-one-options-for-compression.patch 0001-Thrift-files.patch Replace compression and compression_options config parameters by just a compression_options map. Key: CASSANDRA-3128 URL: https://issues.apache.org/jira/browse/CASSANDRA-3128 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 1.0 Reporter: Sylvain Lebresne Assignee: Sylvain Lebresne Priority: Minor Fix For: 1.0 Attachments: 0001-Thrift-files.patch, 0002-Use-only-one-options-for-compression.patch As suggested on CASSANDRA-3105, as long as 1.0 is not out, we could replace the 'compression' and 'compression_options' parameters by just one that would allow writing: {noformat} compression_options = { sstable_compression: SnappyCompressor, block_length_kb: 32 } {noformat} This would be more future-proof, in particular if we decide to make CASSANDRA-3015 pluggable in the future, or for CASSANDRA-3127, as this would allow us to simply evolve to say: {noformat} compression_options = { sstable_compression: SnappyCompressor, block_length_kb: 32, stream_compression: LZFCompressor } {noformat} This has the advantages of (1) not polluting CfDef and (2) leaving the option of documenting some options only in advanced documentation (if a given option is not meant for new users).
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097225#comment-13097225 ] Pavel Yaskevich commented on CASSANDRA-2474: The core difference is that the (..,..,..) notation will return the given aliases (category, subcategory) as column names in the results.
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097231#comment-13097231 ] Sylvain Lebresne commented on CASSANDRA-2474: - bq. The core difference is that (..,..,..) notation will return given aliases (category, subcategory) as column names in the results But how will it do that? The result of {noformat} SELECT c1:*..c2:*, value FROM events WHERE key = timestamp; {noformat} would be something like {noformat}
Key       | c1:subc1     | c1:subc2     | c1:subc3     | c2:subc1     |
timestamp | event_value1 | event_value2 | event_value3 | event_value4 |
{noformat} What does the result look like with 'given aliases (category, subcategory) as column names in the results'?
[jira] [Issue Comment Edited] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097242#comment-13097242 ] Jonathan Ellis edited comment on CASSANDRA-2474 at 9/5/11 6:50 PM: --- bq. composite/super columns won't originally play nice with SQL syntax because it wasn't designed to query hierarchical data That's exactly what transposition solves -- taking a horizontal slice and turning it sideways into a resultset with the same set of columns. We are NOT trying to solve a more generic hierarchical data problem -- all the (leaf) data we select has to be at the same level in the hierarchy. bq. if we have 10 subcolumns do I need to list them all using component syntax You will if you are using the dense format. And let's be clear: this is NOT the recommended way to do things, because it is fragile, as described above. We want to support it, but making it beautiful is not our goal. bq. it lacks scoping therefore on the big queries it will be hard to read, e.g. SELECT component1 AS tweet_id, component2 AS username, body, location, age, value I don't understand, that seems perfectly readable to me. bq. SELECT name AS (tweet_id, username | body | location | age), value AS body This syntax is not viable for the reasons given in my previous comment. I'm happy to entertain other alternatives to the component syntax but there's no need to spend further time on this one. was (Author: jbellis): bq. composite/super columns won't originally play nice with SQL syntax because it wasn't designed to query hierarchical data That's exactly what transposition solves -- taking a horizontal slice and turning it sideways into a resultset with the same set of columns. We are NOT trying to solve a more generic hierarchical data problem -- all the (leaf) data we select has to be at the same level in the hierarchy. bq. if we have 10 subcolumns do I need to list them all using component syntax You will if you are using the dense format.
And let's be clear: this is NOT the recommended way to do things, because it is fragile, as described above. We want to support it, but making it beautiful is not our goal. bq. will potentially be hard to put into grammar because it can have ambiguous rules again because of lack of scoping bq. it lacks scoping therefore on the big queries it will be hard to read, e.g. SELECT component1 AS tweet_id, component2 AS username, body, location, age, value I don't understand, that seems perfectly readable to me. bq. SELECT name AS (tweet_id, username | body | location | age), value AS body This syntax is not viable for the reasons given in my previous comment. I'm happy to entertain other alternatives to the component syntax but there's no need to spend further time on this one.
[jira] [Commented] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097242#comment-13097242 ]

Jonathan Ellis commented on CASSANDRA-2474:
---

bq. composite/super columns won't originally play nice with SQL syntax because it wasn't designed to query hierarchical data

That's exactly what transposition solves -- taking a horizontal slice and turning it sideways into a resultset with the same set of columns. We are NOT trying to solve a more generic hierarchical data problem -- all the (leaf) data we select has to be at the same level in the hierarchy.

bq. if we have 10 subcolumns do I need to list them all using component syntax

You will if you are using the dense format. And let's be clear: this is NOT the recommended way to do things, because it is fragile, as described above. We want to support it, but making it beautiful is not our goal.

bq. will potentially be hard to put into grammar because it can have ambiguous rules again because lack of scoping

bq. it lacks scoping therefore on the big queries it will be hard to read e.g. SELECT component1 AS tweet_id, component2 AS username, body, location, age, value

I don't understand; that seems perfectly readable to me.

bq. SELECT name AS (tweet_id, username | body | location | age), value AS body

This syntax is not viable for the reasons given in my previous comment. I'm happy to entertain other alternatives to the component syntax, but there's no need to spend further time on this one.
CQL support for compound columns

Key: CASSANDRA-2474
URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
Labels: cql
Fix For: 1.0
Attachments: screenshot-1.jpg, screenshot-2.jpg

For the most part, this boils down to supporting the specification of compound column names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to create structures from the results.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
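The description above says compound column names are colon-delimited terms that drivers must decode into structures. As a minimal illustration of that syntax level only -- the class and method names below are hypothetical, not part of any Cassandra driver, and real drivers decode the binary composite encoding rather than splitting strings:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only: shows the colon-delimited *syntax* described in
// the ticket, not the drivers' actual binary decoding. Names are hypothetical.
public class CompoundName {
    // Split a colon-delimited compound column name into its component terms.
    public static List<String> components(String name) {
        return Arrays.asList(name.split(":"));
    }

    public static void main(String[] args) {
        List<String> parts = components("tweet_id:username:body");
        System.out.println(parts);  // prints [tweet_id, username, body]
    }
}
```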
[jira] [Issue Comment Edited] (CASSANDRA-2474) CQL support for compound columns
[ https://issues.apache.org/jira/browse/CASSANDRA-2474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097242#comment-13097242 ]

Jonathan Ellis edited comment on CASSANDRA-2474 at 9/5/11 6:50 PM:
---

bq. composite/super columns won't originally play nice with SQL syntax because it wasn't designed to query hierarchical data

That's exactly what transposition solves -- taking a horizontal slice and turning it sideways into a resultset with the same set of columns. We are NOT trying to solve a more generic hierarchical data problem -- all the (leaf) data we select has to be at the same level in the hierarchy.

bq. if we have 10 subcolumns do I need to list them all using component syntax

You will if you are using the dense format. And let's be clear: this is NOT the recommended way to do things, because it is fragile, as described above. We want to support it [dense], but making it beautiful is not our goal. Sparse encoding will be the recommended practice.

bq. it lacks scoping therefore on the big queries it will be hard to read, e.g. SELECT component1 AS tweet_id, component2 AS username, body, location, age, value

I don't understand; that seems perfectly readable to me.

bq. SELECT name AS (tweet_id, username | body | location | age), value AS body

This syntax is not viable for the reasons given in my previous comment. I'm happy to entertain other alternatives to the component syntax, but there's no need to spend further time on this one.

was (Author: jbellis):

bq. composite/super columns won't originally play nice with SQL syntax because it wasn't designed to query hierarchical data

That's exactly what transposition solves -- taking a horizontal slice and turning it sideways into a resultset with the same set of columns. We are NOT trying to solve a more generic hierarchical data problem -- all the (leaf) data we select has to be at the same level in the hierarchy.

bq. if we have 10 subcolumns do I need to list them all using component syntax

You will if you are using the dense format. And let's be clear: this is NOT the recommended way to do things, because it is fragile, as described above. We want to support it, but making it beautiful is not our goal.

bq. it lacks scoping therefore on the big queries it will be hard to read, e.g. SELECT component1 AS tweet_id, component2 AS username, body, location, age, value

I don't understand; that seems perfectly readable to me.

bq. SELECT name AS (tweet_id, username | body | location | age), value AS body

This syntax is not viable for the reasons given in my previous comment. I'm happy to entertain other alternatives to the component syntax, but there's no need to spend further time on this one.

CQL support for compound columns

Key: CASSANDRA-2474
URL: https://issues.apache.org/jira/browse/CASSANDRA-2474
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: Eric Evans
Assignee: Pavel Yaskevich
Labels: cql
Fix For: 1.0
Attachments: screenshot-1.jpg, screenshot-2.jpg

For the most part, this boils down to supporting the specification of compound column names (the CQL syntax is colon-delimited terms), and then teaching the decoders (drivers) to create structures from the results.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3031) Add 4 byte integer type
[ https://issues.apache.org/jira/browse/CASSANDRA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097244#comment-13097244 ]

Jonathan Ellis commented on CASSANDRA-3031:
---

I hate to do that because of the WTF-inducement it will have on novices. CQL is young enough that we should be trying to optimize for the thousands of people who have never tried it yet, not the tens of people (being generous) who have used it in a real system. Calling it 1.0 instead of 0.1 doesn't change that.

Add 4 byte integer type
---
Key: CASSANDRA-3031
URL: https://issues.apache.org/jira/browse/CASSANDRA-3031
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.8.4
Environment: any
Reporter: Radim Kolar
Priority: Minor
Labels: hector, lhf
Fix For: 1.0
Attachments: apache-cassandra-0.8.4-SNAPSHOT.jar, src.diff, test.diff

Cassandra currently lacks support for a 4-byte fixed-size integer data type. The Java API Hector and the C libcassandra like to serialize integers as 4 bytes in network order. The problem is that you can't use cassandra-cli to manipulate stored rows. Compatibility with other applications using an API that follows the Cassandra integer encoding standard is problematic too. Because adding a new datatype/validator is fairly simple, I recommend adding an int4 data type. Compatibility with Hector is important because it is the most-used Java Cassandra API and a lot of applications are using it. This problem was discussed several times already:

http://comments.gmane.org/gmane.comp.db.hector.user/2125
https://issues.apache.org/jira/browse/CASSANDRA-2585

It would be nice to have compatibility with cassandra-cli and other applications without rewriting Hector apps.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1165388 - /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/
Author: jbellis
Date: Mon Sep 5 18:55:02 2011
New Revision: 1165388

URL: http://svn.apache.org/viewvc?rev=1165388&view=rev
Log:
clean up JDBC class declarations and accessibility modifiers

patch by Rick Shaw for CASSANDRA-3135

Modified:
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractCassandraConnection.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractResultSet.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractStatement.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CResultSet.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraConnection.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraDatabaseMetaData.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraPreparedStatement.java
    cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/ColumnDecoder.java

Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractCassandraConnection.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractCassandraConnection.java?rev=1165388&r1=1165387&r2=1165388&view=diff
==
--- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractCassandraConnection.java (original)
+++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractCassandraConnection.java Mon Sep 5 18:55:02 2011
@@ -33,7 +33,7 @@
 import java.sql.Savepoint;
 import java.sql.Struct;
 import java.util.Map;
-public class AbstractCassandraConnection
+abstract class AbstractCassandraConnection
 {
     protected static final String NOT_SUPPORTED = "the Cassandra implementation does not support this method";

Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractResultSet.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractResultSet.java?rev=1165388&r1=1165387&r2=1165388&view=diff
==
--- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractResultSet.java (original)
+++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractResultSet.java Mon Sep 5 18:55:02 2011
@@ -26,7 +26,7 @@
 import java.sql.*;
 import java.util.Map;
 /** a class to hold all the unimplemented crap */
-class AbstractResultSet
+abstract class AbstractResultSet
 {
     protected static final String NOT_SUPPORTED = "the Cassandra implementation does not support this method";

Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractStatement.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractStatement.java?rev=1165388&r1=1165387&r2=1165388&view=diff
==
--- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractStatement.java (original)
+++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractStatement.java Mon Sep 5 18:55:02 2011
@@ -24,7 +24,7 @@
 import java.sql.ResultSet;
 import java.sql.SQLException;
 import java.sql.SQLFeatureNotSupportedException;
-public class AbstractStatement
+abstract class AbstractStatement
 {
     protected static final String NOT_SUPPORTED = "the Cassandra implementation does not support this method";

Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CResultSet.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CResultSet.java?rev=1165388&r1=1165387&r2=1165388&view=diff
==
--- cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CResultSet.java (original)
+++ cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CResultSet.java Mon Sep 5 18:55:02 2011
@@ -36,7 +36,7 @@ import org.apache.cassandra.thrift.CqlRe
 import org.apache.cassandra.thrift.CqlRow;
 import org.apache.cassandra.utils.ByteBufferUtil;
-public class CResultSet extends AbstractResultSet implements CassandraResultSet
+class CResultSet extends AbstractResultSet implements CassandraResultSet
 {
     public static final int DEFAULT_TYPE = ResultSet.TYPE_FORWARD_ONLY;
     public static final int DEFAULT_CONCURRENCY = ResultSet.CONCUR_READ_ONLY;

Modified: cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraConnection.java
URL: http://svn.apache.org/viewvc/cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraConnection.java?rev=1165388&r1=1165387&r2=1165388&view=diff
==
---
[jira] [Updated] (CASSANDRA-3135) Tighten class accessibility in JDBC Suite
[ https://issues.apache.org/jira/browse/CASSANDRA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3135:
--
Affects Version/s: (was: 0.8.4)

Tighten class accessibility in JDBC Suite
-
Key: CASSANDRA-3135
URL: https://issues.apache.org/jira/browse/CASSANDRA-3135
Project: Cassandra
Issue Type: Improvement
Components: Drivers
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Trivial
Labels: JDBC
Attachments: tighten-accessability.txt

Tighten up class accessibility: remove the {{public}} modifier from classes in the suite that are not intended to be instantiated directly by a client. In addition, give abstract named classes the {{abstract}} modifier. And finally, mark methods that are not part of public interfaces but are shared within the package as {{protected}}.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
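The pattern this ticket applies -- shared "unimplemented" plumbing in abstract, package-private base classes so clients can neither instantiate nor reference them -- can be sketched as below. The class names are illustrative stand-ins, not the actual driver classes; the check merely confirms the modifiers via reflection:

```java
import java.lang.reflect.Modifier;

// Stand-in for the driver's shared base class: abstract (cannot be
// instantiated) and package-private (not visible outside the package).
abstract class AbstractResultSetSketch {
    protected static final String NOT_SUPPORTED =
        "the Cassandra implementation does not support this method";
}

// Concrete subclass, still package-private: only intended entry points
// would carry the public modifier.
class ResultSetSketch extends AbstractResultSetSketch { }

public class AccessibilityDemo {
    public static void main(String[] args) {
        int mods = AbstractResultSetSketch.class.getModifiers();
        // abstract and not public: clients can neither instantiate nor see it
        System.out.println(Modifier.isAbstract(mods) && !Modifier.isPublic(mods));  // prints true
    }
}
```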
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097250#comment-13097250 ]

Jonathan Ellis commented on CASSANDRA-957:
--

Can you add a short how-to to NEWS.txt describing this feature?

convenience workflow for replacing dead node

Key: CASSANDRA-957
URL: https://issues.apache.org/jira/browse/CASSANDRA-957
Project: Cassandra
Issue Type: Wish
Components: Core, Tools
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Vijay
Fix For: 1.0
Attachments: 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 0001-support-for-replace-token-v3.patch, 0001-support-token-replace-v4.patch, 0001-support-token-replace-v5.patch, 0001-support-token-replace-v6.patch, 0001-support-token-replace-v7.patch, 0002-Do-not-include-local-node-when-computing-workMap.patch, 0002-hints-on-token-than-ip-v4.patch, 0002-hints-on-token-than-ip-v5.patch, 0002-hints-on-token-than-ip-v6.patch, 0002-upport-for-hints-on-token-v3.patch
Original Estimate: 24h
Remaining Estimate: 24h

Replacing a dead node with a new one is a common operation, but nodetool removetoken followed by bootstrap is inefficient (re-replicating data first to the remaining nodes, then to the new one), and manually bootstrapping to a token just less than the old one's, followed by nodetool removetoken, is slightly painful and prone to manual errors.

First question: how would you expose this in our tool ecosystem? It needs to be a startup-time option to the new node, so it can't be nodetool, and messing with the config xml definitely takes the convenience out. A one-off -DreplaceToken=XXY argument?

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-1311) Support (asynchronous) triggers
[ https://issues.apache.org/jira/browse/CASSANDRA-1311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097252#comment-13097252 ]

Jonathan Ellis commented on CASSANDRA-1311:
---

bq. The big minus for the replica level triggers is that no one really wants to get N triggers

That's because, as I've said before, this only really makes sense to me for user-level data once you have entity groups. Coordinator-level triggers are an ugly ball of corner cases that add no functionality over what you can do with a well-designed app-level storage layer. It's an idea that is superficially attractive but is a non-starter once you dig deeper.

Support (asynchronous) triggers
---
Key: CASSANDRA-1311
URL: https://issues.apache.org/jira/browse/CASSANDRA-1311
Project: Cassandra
Issue Type: New Feature
Components: Contrib
Reporter: Maxim Grinev
Fix For: 1.1
Attachments: HOWTO-PatchAndRunTriggerExample-update1.txt, HOWTO-PatchAndRunTriggerExample.txt, ImplementationDetails-update1.pdf, ImplementationDetails.pdf, trunk-967053.txt, trunk-984391-update1.txt, trunk-984391-update2.txt

Asynchronous triggers are a basic mechanism to implement various use cases of asynchronous execution of application code at the database side, for example to support indexes and materialized views, online analytics, and push-based data propagation.

Please find the motivation, trigger description, and list of applications here:
http://maxgrinev.com/2010/07/23/extending-cassandra-with-asynchronous-triggers/

An example of using triggers for indexing:
http://maxgrinev.com/2010/07/23/managing-indexes-in-cassandra-using-async-triggers/

Implementation details are attached.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3069) multiget support in CQL (UNION / OR)
[ https://issues.apache.org/jira/browse/CASSANDRA-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-3069:
--
Fix Version/s: (was: 1.0) 1.1
Summary: multiget support in CQL (UNION / OR) (was: UNION support)

Pushing to 1.1 -- transposition is more important for 1.0

multiget support in CQL (UNION / OR)

Key: CASSANDRA-3069
URL: https://issues.apache.org/jira/browse/CASSANDRA-3069
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Fix For: 1.1
Attachments: CASSANDRA-3069.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3069) UNION support
[ https://issues.apache.org/jira/browse/CASSANDRA-3069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097262#comment-13097262 ]

Jonathan Ellis commented on CASSANDRA-3069:
---

The more I think about it, the more I think that the right thing to do is not to support UNION (which does not map cleanly to StorageProxy calls) but to support ORing keys together in the WHERE clause (which maps nicely to multiget). The downside is we need to explain why we support just this one special case of ORs but not others, but I think that is better than trying to explain other limitations around UNION.

UNION support
-
Key: CASSANDRA-3069
URL: https://issues.apache.org/jira/browse/CASSANDRA-3069
Project: Cassandra
Issue Type: Sub-task
Components: API, Core
Reporter: Jonathan Ellis
Assignee: Pavel Yaskevich
Fix For: 1.1
Attachments: CASSANDRA-3069.patch

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
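A toy sketch of why OR-ing keys "maps nicely to multiget": the OR'd key list collapses into a single batched lookup rather than a general set union. The store shape and names below are hypothetical illustrations, not Cassandra internals:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: a WHERE clause of the shape
//   key = k1 OR key = k2 OR ...
// reduces to one multiget over the listed keys, which is why supporting
// just this special case of OR is tractable while general UNION is not.
public class MultigetSketch {
    static Map<String, String> store = new HashMap<>();

    // Equivalent of: SELECT ... WHERE key = k1 OR key = k2 OR ...
    static Map<String, String> multiget(List<String> keys) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String k : keys) {
            if (store.containsKey(k)) result.put(k, store.get(k));
        }
        return result;
    }

    public static void main(String[] args) {
        store.put("k1", "v1");
        store.put("k2", "v2");
        System.out.println(multiget(Arrays.asList("k1", "k2", "k3")));  // prints {k1=v1, k2=v2}
    }
}
```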
[jira] [Commented] (CASSANDRA-3124) java heap limit for nodetool
[ https://issues.apache.org/jira/browse/CASSANDRA-3124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097266#comment-13097266 ]

Jonathan Ellis commented on CASSANDRA-3124:
---

So you're saying that 64 / 96 MB is _too large_ and you have to reduce it for nodetool to run? I'd say there's something wrong with your environment.

java heap limit for nodetool

Key: CASSANDRA-3124
URL: https://issues.apache.org/jira/browse/CASSANDRA-3124
Project: Cassandra
Issue Type: Improvement
Components: Core, Tools
Affects Versions: 0.8.1, 0.8.2, 0.8.3, 0.8.4
Environment: not important
Reporter: Zenek Kraweznik
Priority: Minor

By default (from the Debian package):

# nodetool
Error occurred during initialization of VM
Could not reserve enough space for object heap
Could not create the Java virtual machine.
#

and:

--- /usr/bin/nodetool.old 2011-09-02 14:15:14.228152799 +0200
+++ /usr/bin/nodetool 2011-09-02 14:14:28.745154552 +0200
@@ -55,7 +55,7 @@
 ;;
 esac

-$JAVA -cp $CLASSPATH -Dstorage-config=$CASSANDRA_CONF \
+$JAVA -Xmx32m -cp $CLASSPATH -Dstorage-config=$CASSANDRA_CONF \
     -Dlog4j.configuration=log4j-tools.properties \
     org.apache.cassandra.tools.NodeCmd $@

After every upgrade I had to add the limit manually. I think it's a good idea to add it by default ;)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
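When debugging a "Could not reserve enough space for object heap" failure like the one quoted in this ticket, it can help to see what heap ceiling a JVM actually received. A minimal sketch (the class name is mine; run it with, e.g., `java -Xmx32m HeapCheck` to mimic the patched nodetool script):

```java
// Print the maximum heap the running JVM will attempt to use.
// Useful for comparing the environment's default against an explicit -Xmx.
public class HeapCheck {
    public static void main(String[] args) {
        long maxMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MB): " + maxMb);
    }
}
```

With `-Xmx32m` the printed value will be close to (slightly under) 32 MB, which is plenty for a thin JMX client like nodetool.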
[jira] [Commented] (CASSANDRA-3133) nodetool netstats doesn't show streams during decommission
[ https://issues.apache.org/jira/browse/CASSANDRA-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097268#comment-13097268 ]

Jonathan Ellis commented on CASSANDRA-3133:
---

So really it is not certain whether the problem is nodetool not showing streams, or decommission not finishing after streaming is complete?

nodetool netstats doesn't show streams during decommission
--
Key: CASSANDRA-3133
URL: https://issues.apache.org/jira/browse/CASSANDRA-3133
Project: Cassandra
Issue Type: Bug
Components: Tools
Affects Versions: 0.8.4
Environment: debian 6.0.2.1 (squeeze), java 1.6.26 (Sun, non-free packages)
Reporter: Zenek Kraweznik

nodetool netstats is not showing transferred files from decommission

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3108) Make Range and Bounds objects client-safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097271#comment-13097271 ]

Jonathan Ellis commented on CASSANDRA-3108:
---

That was unintentional -- how did I do that?

Make Range and Bounds objects client-safe
-
Key: CASSANDRA-3108
URL: https://issues.apache.org/jira/browse/CASSANDRA-3108
Project: Cassandra
Issue Type: Bug
Affects Versions: 0.8.2
Reporter: Jonathan Ellis
Assignee: Mck SembWever
Labels: hadoop
Fix For: 0.8.5
Attachments: 3108.txt

From Mck's comment on CASSANDRA-1125:

Something broke here in production once we went out with 0.8.2. It may have been some poor testing; I'm not entirely sure and a little surprised. CFIF:135 breaks because inside dhtRange.intersects(jobRange) there's a call to new Range(token, token), which calls StorageService.getPartitioner(), and StorageService is null as we're not inside the server. A quick fix is to change Range:148 from new Range(token, token) to new Range(token, token, partitioner), making the presumption that the partitioner for the new Range will be the same as this Range.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
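The quick fix described above -- `new Range(token, token, partitioner)` instead of relying on the static `StorageService.getPartitioner()` -- is essentially dependency injection. A simplified sketch with stand-in types (these are not Cassandra's real classes; they only illustrate why the explicit-partitioner constructor is client-safe):

```java
// Stand-in types, not the real Cassandra API: they show why passing the
// partitioner explicitly works in a client while a static lookup does not.
interface Partitioner {
    String name();
}

class ServerRegistry {
    // In the server this would be initialized at startup;
    // in a Hadoop client it stays null -- hence the reported NPE path.
    static Partitioner partitioner = null;
}

class TokenRange {
    final String left, right;
    final Partitioner partitioner;

    // Fragile: depends on server-only static state.
    TokenRange(String left, String right) {
        this(left, right, ServerRegistry.partitioner);
    }

    // Client-safe: the caller supplies the partitioner it already has.
    TokenRange(String left, String right, Partitioner p) {
        this.left = left;
        this.right = right;
        this.partitioner = p;
    }
}

public class RangeDemo {
    public static void main(String[] args) {
        Partitioner p = () -> "RandomPartitioner";
        TokenRange safe = new TokenRange("0", "0", p);
        System.out.println(safe.partitioner.name());  // prints RandomPartitioner
    }
}
```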
[jira] [Resolved] (CASSANDRA-3136) Allow CFIF to keep going despite unavailable ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3136.
---
Resolution: Won't Fix

As explained when this was injected into another ticket, supporting this very niche scenario is not worth adding complexity to our Hadoop interface. The right way to support fault-tolerant queries is to increase RF.

Allow CFIF to keep going despite unavailable ranges
---
Key: CASSANDRA-3136
URL: https://issues.apache.org/jira/browse/CASSANDRA-3136
Project: Cassandra
Issue Type: Improvement
Components: Hadoop
Reporter: Mck SembWever
Priority: Minor

From http://thread.gmane.org/gmane.comp.db.cassandra.user/18902

Use case 1 (from Patrik Modesto):
We use Cassandra as storage for web pages; we store the HTML, all URLs that have the same HTML data, and some computed data. We run Hadoop MR jobs to compute lexical and thematic data for each page and to export the data to binary files for later use. A URL gets into Cassandra on a user request (a pageview), so if we delete a URL it comes back quickly if the page is active. Because of that, and because there is lots of data, we have the keyspace set to RF=1. We can drop the whole keyspace and it will regenerate quickly and contain only fresh data, so we don't care about losing a node.

Use case 2:
Trying to extract a small random sample (like a Pig SAMPLE) of data out of Cassandra.

Use case 3:
Searching for something or some pattern where one hit is enough. If you get the hit, it's a positive result regardless of whether ranges were ignored; if you don't, and you *know* there was a range ignored along the way, you can re-run the job later. For example, such a job could be run at regular intervals during the day until a hit was found.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3135) Tighten class accessibility in JDBC Suite
[ https://issues.apache.org/jira/browse/CASSANDRA-3135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097274#comment-13097274 ]

Hudson commented on CASSANDRA-3135:
---

Integrated in Cassandra #1075 (See [https://builds.apache.org/job/Cassandra/1075/])

clean up JDBC class declarations and accessibility modifiers
patch by Rick Shaw for CASSANDRA-3135

jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1165388
Files :
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractCassandraConnection.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractResultSet.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/AbstractStatement.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CResultSet.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraConnection.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraDatabaseMetaData.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/CassandraPreparedStatement.java
* /cassandra/trunk/drivers/java/src/org/apache/cassandra/cql/jdbc/ColumnDecoder.java

Tighten class accessibility in JDBC Suite
-
Key: CASSANDRA-3135
URL: https://issues.apache.org/jira/browse/CASSANDRA-3135
Project: Cassandra
Issue Type: Improvement
Components: Drivers
Reporter: Rick Shaw
Assignee: Rick Shaw
Priority: Trivial
Labels: JDBC
Attachments: tighten-accessability.txt

Tighten up class accessibility: remove the {{public}} modifier from classes in the suite that are not intended to be instantiated directly by a client. In addition, give abstract named classes the {{abstract}} modifier. And finally, mark methods that are not part of public interfaces but are shared within the package as {{protected}}.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-3031) Add 4 byte integer type
[ https://issues.apache.org/jira/browse/CASSANDRA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097244#comment-13097244 ]

Jonathan Ellis edited comment on CASSANDRA-3031 at 9/5/11 7:26 PM:
---

bq. How about deprecating int and introducing int4 and int8?

I hate to do that because of the WTF-inducement it will have on novices. CQL is young enough that we should be trying to optimize for the thousands of people who have never tried it yet, not the tens of people (being generous) who have used it in a real system. Calling it 1.0 instead of 0.1 doesn't change that.

was (Author: jbellis):

I hate to do that because of the WTF-inducement it will have on novices. CQL is young enough that we should be trying to optimize for the thousands of people who have never tried it yet, not the tens of people (being generous) who have used it in a real system. Calling it 1.0 instead of 0.1 doesn't change that.

Add 4 byte integer type
---
Key: CASSANDRA-3031
URL: https://issues.apache.org/jira/browse/CASSANDRA-3031
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.8.4
Environment: any
Reporter: Radim Kolar
Priority: Minor
Labels: hector, lhf
Fix For: 1.0
Attachments: apache-cassandra-0.8.4-SNAPSHOT.jar, src.diff, test.diff

Cassandra currently lacks support for a 4-byte fixed-size integer data type. The Java API Hector and the C libcassandra like to serialize integers as 4 bytes in network order. The problem is that you can't use cassandra-cli to manipulate stored rows. Compatibility with other applications using an API that follows the Cassandra integer encoding standard is problematic too. Because adding a new datatype/validator is fairly simple, I recommend adding an int4 data type. Compatibility with Hector is important because it is the most-used Java Cassandra API and a lot of applications are using it. This problem was discussed several times already:

http://comments.gmane.org/gmane.comp.db.hector.user/2125
https://issues.apache.org/jira/browse/CASSANDRA-2585

It would be nice to have compatibility with cassandra-cli and other applications without rewriting Hector apps.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-3031) Add 4 byte integer type
[ https://issues.apache.org/jira/browse/CASSANDRA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097244#comment-13097244 ]

Jonathan Ellis edited comment on CASSANDRA-3031 at 9/5/11 7:27 PM:
---

bq. How about deprecating int and introducing int4 and int8?

I hate to do that because of the WTF-inducement it will have on novices. CQL is young enough that we should be trying to optimize for the thousands of people who have never tried it yet, not the tens of people (being generous) who have used it in a real system. Calling [the original CQL API released in 0.8] 1.0 instead of 0.1 doesn't change that.

was (Author: jbellis):

bq. How about deprecating int and introducing int4 and int8?

I hate to do that because of the WTF-inducement it will have on novices. CQL is young enough that we should be trying to optimize for the thousands of people who have never tried it yet, not the tens of people (being generous) who have used it in a real system. Calling it 1.0 instead of 0.1 doesn't change that.

Add 4 byte integer type
---
Key: CASSANDRA-3031
URL: https://issues.apache.org/jira/browse/CASSANDRA-3031
Project: Cassandra
Issue Type: Improvement
Components: Core
Affects Versions: 0.8.4
Environment: any
Reporter: Radim Kolar
Priority: Minor
Labels: hector, lhf
Fix For: 1.0
Attachments: apache-cassandra-0.8.4-SNAPSHOT.jar, src.diff, test.diff

Cassandra currently lacks support for a 4-byte fixed-size integer data type. The Java API Hector and the C libcassandra like to serialize integers as 4 bytes in network order. The problem is that you can't use cassandra-cli to manipulate stored rows. Compatibility with other applications using an API that follows the Cassandra integer encoding standard is problematic too. Because adding a new datatype/validator is fairly simple, I recommend adding an int4 data type. Compatibility with Hector is important because it is the most-used Java Cassandra API and a lot of applications are using it. This problem was discussed several times already:

http://comments.gmane.org/gmane.comp.db.hector.user/2125
https://issues.apache.org/jira/browse/CASSANDRA-2585

It would be nice to have compatibility with cassandra-cli and other applications without rewriting Hector apps.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-3140) Expose server, api versions to CQL
Expose server, api versions to CQL
--
Key: CASSANDRA-3140
URL: https://issues.apache.org/jira/browse/CASSANDRA-3140
Project: Cassandra
Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 1.0

Need to expose the CQL api version; might as well include the server version while we're at it.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3140) Expose server, api versions to CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097278#comment-13097278 ]

Jonathan Ellis commented on CASSANDRA-3140:
---

Maybe just SELECT api_version(), server_version() ? Open to suggestions that are less one-off-ish.

Expose server, api versions to CQL
--
Key: CASSANDRA-3140
URL: https://issues.apache.org/jira/browse/CASSANDRA-3140
Project: Cassandra
Issue Type: New Feature
Reporter: Jonathan Ellis
Priority: Minor
Fix For: 1.0

Need to expose the CQL api version; might as well include the server version while we're at it.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3031) Add 4 byte integer type
[ https://issues.apache.org/jira/browse/CASSANDRA-3031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097279#comment-13097279 ] Jonathan Ellis commented on CASSANDRA-3031: --- Speaking of CQL versioning, created CASSANDRA-3140. Add 4 byte integer type --- Key: CASSANDRA-3031 URL: https://issues.apache.org/jira/browse/CASSANDRA-3031 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.4 Environment: any Reporter: Radim Kolar Priority: Minor Labels: hector, lhf Fix For: 1.0 Attachments: apache-cassandra-0.8.4-SNAPSHOT.jar, src.diff, test.diff Cassandra currently lacks support for a 4-byte fixed-size integer data type. The Java API Hector and the C libcassandra library like to serialize integers as 4 bytes in network order. The problem is that you can't use cassandra-cli to manipulate the stored rows. Compatibility with other applications using an API that follows the Cassandra integer encoding standard is problematic too. Because adding a new datatype/validator is fairly simple, I recommend adding an int4 data type. Compatibility with Hector is important because it is the most used Java Cassandra API and a lot of applications are using it. This problem has been discussed several times already: http://comments.gmane.org/gmane.comp.db.hector.user/2125 https://issues.apache.org/jira/browse/CASSANDRA-2585 It would be nice to have compatibility with cassandra-cli and other applications without rewriting Hector apps. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3136) Allow CFIF to keep going despite unavailable ranges
[ https://issues.apache.org/jira/browse/CASSANDRA-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097280#comment-13097280 ] Mck SembWever commented on CASSANDRA-3136: -- Ok... it was mentioned in CASSANDRA-2388 (by Patrik Modesto), but no one there paid it any attention as it didn't belong to that issue. Allow CFIF to keep going despite unavailable ranges --- Key: CASSANDRA-3136 URL: https://issues.apache.org/jira/browse/CASSANDRA-3136 Project: Cassandra Issue Type: Improvement Components: Hadoop Reporter: Mck SembWever Priority: Minor From http://thread.gmane.org/gmane.comp.db.cassandra.user/18902 use-case-1 from=Patrik Modesto We use Cassandra as a storage for web pages; we store the HTML, all URLs that have the same HTML data, and some computed data. We run Hadoop MR jobs to compute lexical and thematic data for each page and to export the data to binary files for later use. A URL gets to Cassandra on user request (a pageview), so if we delete a URL, it comes back quickly if the page is active. Because of that, and because there is lots of data, we have the keyspace set to RF=1. We can drop the whole keyspace and it will regenerate quickly and would contain only fresh data, so we don't care about losing a node. /use-case-1 use-case-2 trying to extract a small random sample (like a pig SAMPLE) of data out of cassandra. /use-case-2 use-case-3 searching for something or some pattern where one hit is enough. If you get the hit it's a positive result regardless of whether ranges were ignored; if you don't, and you *know* there was a range ignored along the way, you can re-run the job later. For example such a job could be run at regular intervals in the day until a hit was found. /use-case-3 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3108) Make Range and Bounds objects client-safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097282#comment-13097282 ] Mck SembWever commented on CASSANDRA-3108: -- You drastically removed the usage of the {{Range(left, right)}} constructor so that even the usage of {{intersectionBothWrapping(..)}} and {{intersectionOneWrapping(..)}} avoids any server-side calls. Make Range and Bounds objects client-safe - Key: CASSANDRA-3108 URL: https://issues.apache.org/jira/browse/CASSANDRA-3108 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.2 Reporter: Jonathan Ellis Assignee: Mck SembWever Labels: hadoop Fix For: 0.8.5 Attachments: 3108.txt From Mck's comment on CASSANDRA-1125: Something broke here in production once we went out with 0.8.2. It may have been some poor testing, i'm not entirely sure and a little surprised. CFIF:135 breaks because inside dhtRange.intersects(jobRange) there's a call to new Range(token, token) which calls StorageService.getPartitioner() and StorageService is null as we're not inside the server. A quick fix is to change Range:148 from new Range(token, token) to new Range(token, token, partitioner) making the presumption that the partitioner for the new Range will be the same as this Range. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-3108) Make Range and Bounds objects client-safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097282#comment-13097282 ] Mck SembWever edited comment on CASSANDRA-3108 at 9/5/11 7:41 PM: -- You drastically removed the usage of the {{Range(left, right)}} constructor so that even the usage of {{intersectionBothWrapping(..)}} and {{intersectionOneWrapping(..)}} avoids any server-side calls. In CFIF there AFAIK doesn't seem any other limitation to wrapping ranges... was (Author: michaelsembwever): You drastically removed the usage of the {{Range(left, right)}} constructor so that even the usage of {{intersectionBothWrapping(..)}} and {{intersectionOneWrapping(..)}} avoids any server-side calls. It CFIF there AFAIK doesn't seem any other limitation to wrapping ranges... Make Range and Bounds objects client-safe - Key: CASSANDRA-3108 URL: https://issues.apache.org/jira/browse/CASSANDRA-3108 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.2 Reporter: Jonathan Ellis Assignee: Mck SembWever Labels: hadoop Fix For: 0.8.5 Attachments: 3108.txt From Mck's comment on CASSANDRA-1125: Something broke here in production once we went out with 0.8.2. It may have been some poor testing, i'm not entirely sure and a little surprised. CFIF:135 breaks because inside dhtRange.intersects(jobRange) there's a call to new Range(token, token) which calls StorageService.getPartitioner() and StorageService is null as we're not inside the server. A quick fix is to change Range:148 from new Range(token, token) to new Range(token, token, partitioner) making the presumption that the partitioner for the new Range will be the same as this Range. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Issue Comment Edited] (CASSANDRA-3108) Make Range and Bounds objects client-safe
[ https://issues.apache.org/jira/browse/CASSANDRA-3108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097282#comment-13097282 ] Mck SembWever edited comment on CASSANDRA-3108 at 9/5/11 7:41 PM: -- You drastically removed the usage of the {{Range(left, right)}} constructor so that even the usage of {{intersectionBothWrapping(..)}} and {{intersectionOneWrapping(..)}} avoids any server-side calls. It CFIF there AFAIK doesn't seem any other limitation to wrapping ranges... was (Author: michaelsembwever): You drastically removed the usage of the {{Range(left, right)}} constructor so that even the usage of {{intersectionBothWrapping(..)}} and {{intersectionOneWrapping(..)}} avoids any server-side calls. Make Range and Bounds objects client-safe - Key: CASSANDRA-3108 URL: https://issues.apache.org/jira/browse/CASSANDRA-3108 Project: Cassandra Issue Type: Bug Affects Versions: 0.8.2 Reporter: Jonathan Ellis Assignee: Mck SembWever Labels: hadoop Fix For: 0.8.5 Attachments: 3108.txt From Mck's comment on CASSANDRA-1125: Something broke here in production once we went out with 0.8.2. It may have been some poor testing, i'm not entirely sure and a little surprised. CFIF:135 breaks because inside dhtRange.intersects(jobRange) there's a call to new Range(token, token) which calls StorageService.getPartitioner() and StorageService is null as we're not inside the server. A quick fix is to change Range:148 from new Range(token, token) to new Range(token, token, partitioner) making the presumption that the partitioner for the new Range will be the same as this Range. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
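The client-safety issue behind this ticket can be shown with a toy sketch: passing the partitioner into the Range constructor removes the need for a server-side singleton lookup (the StorageService.getPartitioner() call that is null outside the server). All names below are illustrative stand-ins, not Cassandra's real Range class:

```java
public class RangeSketch {
    // Illustrative stand-in for Cassandra's IPartitioner.
    interface Partitioner { String name(); }

    static final class Range {
        final long left, right;
        final Partitioner partitioner;

        // Client-safe: the caller supplies the partitioner explicitly, so no
        // call into a server-only singleton is ever needed on the client side.
        Range(long left, long right, Partitioner partitioner) {
            this.left = left;
            this.right = right;
            this.partitioner = partitioner;
        }

        // (left, right] semantics; a range with left >= right wraps the ring.
        boolean contains(long token) {
            return left < right ? (token > left && token <= right)
                                : (token > left || token <= right);
        }
    }

    public static void main(String[] args) {
        Partitioner p = () -> "RandomPartitioner";
        Range wrapping = new Range(100, 10, p); // wraps around the ring
        System.out.println(wrapping.contains(5));   // true
        System.out.println(wrapping.contains(50));  // false
    }
}
```

The quick fix quoted in the issue (new Range(token, token, partitioner) instead of new Range(token, token)) follows exactly this shape, presuming the new Range shares its parent's partitioner.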
svn commit: r1165405 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/locator/PropertyFileSnitch.java
Author: jbellis Date: Mon Sep 5 19:58:27 2011 New Revision: 1165405 URL: http://svn.apache.org/viewvc?rev=1165405view=rev Log: avoid trying to watch cassandra-topology.properties when loaded from jar patch by Mck SembWever; reviewed by jbellis for CASSANDRA-3138 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1165405r1=1165404r2=1165405view=diff == --- cassandra/branches/cassandra-0.8/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.8/CHANGES.txt Mon Sep 5 19:58:27 2011 @@ -1,3 +1,8 @@ +0.8.6 + * avoid trying to watch cassandra-topology.properties when loaded from jar + (CASSANDRA-3138) + + 0.8.5 * fix NPE when encryption_options is unspecified (CASSANDRA-3007) * include column name in validation failure exceptions (CASSANDRA-2849) Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java?rev=1165405r1=1165404r2=1165405view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java Mon Sep 5 19:58:27 2011 @@ -58,14 +58,22 @@ public class PropertyFileSnitch extends public PropertyFileSnitch() throws ConfigurationException { reloadConfiguration(); -Runnable runnable = new WrappedRunnable() +try { -protected void runMayThrow() throws ConfigurationException +FBUtilities.resourceToFile(RACK_PROPERTY_FILENAME); +Runnable runnable = new WrappedRunnable() { -reloadConfiguration(); -} -}; -ResourceWatcher.watch(RACK_PROPERTY_FILENAME, runnable, 60 * 1000); +protected void runMayThrow() throws ConfigurationException +{ +reloadConfiguration(); +} +}; +ResourceWatcher.watch(RACK_PROPERTY_FILENAME, runnable, 60 * 1000); +} +catch (ConfigurationException ex) +{ +logger.debug(RACK_PROPERTY_FILENAME + " found, but does not look like a plain file. Will not watch it for changes"); +} } /**
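The committed pattern above probes whether the resource resolves to a plain file before registering a watcher, and degrades gracefully when it is served from a jar. A standalone analogue is sketched below; resourceToFile here is a stand-in for Cassandra's FBUtilities.resourceToFile, not the real implementation:

```java
import java.io.File;
import java.net.URL;

public class SnitchWatchSketch {
    // Analogue of FBUtilities.resourceToFile: resolve a classpath resource
    // to a plain file, or throw if it is absent or lives inside a jar.
    public static File resourceToFile(String name) throws Exception {
        URL url = SnitchWatchSketch.class.getClassLoader().getResource(name);
        if (url == null || !"file".equals(url.getProtocol()))
            throw new Exception("unable to locate " + name + " as a plain file");
        return new File(url.toURI());
    }

    public static void main(String[] args) {
        try {
            resourceToFile("cassandra-topology.properties");
            System.out.println("watching for changes");
        } catch (Exception ex) {
            // The fix: log once and skip watching, instead of failing
            // on every timer tick as in the original bug report.
            System.out.println("not a plain file; will not watch it");
        }
    }
}
```

Run standalone (with no such file on the classpath) this takes the catch branch, mirroring the debug-logged path the patch adds.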
[jira] [Resolved] (CASSANDRA-3138) PropertyFileSnitch's ResourceWatcher fails because it uses FBUtilities.resourceToFile(..) while PropertyFileSnitch uses classloader.getResourceAsStream(..)
[ https://issues.apache.org/jira/browse/CASSANDRA-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis resolved CASSANDRA-3138. --- Resolution: Fixed Reviewer: jbellis committed (w/ logged message at debug), thanks! PropertyFileSnitch's ResourceWatcher fails because it uses FBUtilities.resourceToFile(..) while PropertyFileSnitch uses classloader.getResourceAsStream(..) --- Key: CASSANDRA-3138 URL: https://issues.apache.org/jira/browse/CASSANDRA-3138 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.0 Reporter: Mck SembWever Assignee: Mck SembWever Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3138.patch Resource files are not necessarily plain files. They could be inside a jar file. See CASSANDRA-2036 This will cause {noformat}ERROR 24:15,806 ResourceWatcher$WatchedResource: Timed run of class org.apache.cassandra.locator.PropertyFileSnitch$1 failed. org.apache.cassandra.config.ConfigurationException: unable to locate cassandra-topology.properties at org.apache.cassandra.utils.FBUtilities.resourceToFile(FBUtilities.java:467) at org.apache.cassandra.utils.ResourceWatcher$WatchedResource.run(ResourceWatcher.java:57) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662){noformat} --
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3138) PropertyFileSnitch's ResourceWatcher fails because it uses FBUtilities.resourceToFile(..) while PropertyFileSnitch uses classloader.getResourceAsStream(..)
[ https://issues.apache.org/jira/browse/CASSANDRA-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-3138: -- Priority: Minor (was: Major) Affects Version/s: (was: 0.8.4) 0.7.0 Fix Version/s: 0.8.6 PropertyFileSnitch's ResourceWatcher fails because it uses FBUtilities.resourceToFile(..) while PropertyFileSnitch uses classloader.getResourceAsStream(..) --- Key: CASSANDRA-3138 URL: https://issues.apache.org/jira/browse/CASSANDRA-3138 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.0 Reporter: Mck SembWever Assignee: Mck SembWever Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3138.patch Resource files are not necessarily plain files. They could be inside a jar file. See CASSANDRA-2036 This will cause {noformat}RROR 24:15,806 ResourceWatcher$WatchedResource: Timed run of class org.apache.cassandra.locator.PropertyFileSnitch$1 failed. org.apache.cassandra.config.ConfigurationException: unable to locate cassandra-topology.properties at org.apache.cassandra.utils.FBUtilities.resourceToFile(FBUtilities.java:467) at org.apache.cassandra.utils.ResourceWatcher$WatchedResource.run(ResourceWatcher.java:57) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662){noformat} 
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3138) PropertyFileSnitch's ResourceWatcher fails because it uses FBUtilities.resourceToFile(..) while PropertyFileSnitch uses classloader.getResourceAsStream(..)
[ https://issues.apache.org/jira/browse/CASSANDRA-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097309#comment-13097309 ] Hudson commented on CASSANDRA-3138: --- Integrated in Cassandra-0.8 #314 (See [https://builds.apache.org/job/Cassandra-0.8/314/]) avoid trying to watch cassandra-topology.properties when loaded from jar patch by Mck SembWever; reviewed by jbellis for CASSANDRA-3138 jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1165405 Files : * /cassandra/branches/cassandra-0.8/CHANGES.txt * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/locator/PropertyFileSnitch.java PropertyFileSnitch's ResourceWatcher fails because it uses FBUtilities.resourceToFile(..) while PropertyFileSnitch uses classloader.getResourceAsStream(..) --- Key: CASSANDRA-3138 URL: https://issues.apache.org/jira/browse/CASSANDRA-3138 Project: Cassandra Issue Type: Bug Components: Core Affects Versions: 0.7.0 Reporter: Mck SembWever Assignee: Mck SembWever Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3138.patch Resource files are not necessarily plain files. They could be inside a jar file. See CASSANDRA-2036 This will cause {noformat}RROR 24:15,806 ResourceWatcher$WatchedResource: Timed run of class org.apache.cassandra.locator.PropertyFileSnitch$1 failed. 
org.apache.cassandra.config.ConfigurationException: unable to locate cassandra-topology.properties at org.apache.cassandra.utils.FBUtilities.resourceToFile(FBUtilities.java:467) at org.apache.cassandra.utils.ResourceWatcher$WatchedResource.run(ResourceWatcher.java:57) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRunAndReset(FutureTask.java:317) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:150) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$101(ScheduledThreadPoolExecutor.java:98) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.runPeriodic(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:204) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662){noformat} -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-3141) SSTableSimpleUnsortedWriter call to ColumnFamily.serializedSize iterate through the whole columns
SSTableSimpleUnsortedWriter call to ColumnFamily.serializedSize iterate through the whole columns - Key: CASSANDRA-3141 URL: https://issues.apache.org/jira/browse/CASSANDRA-3141 Project: Cassandra Issue Type: Improvement Components: Core Affects Versions: 0.8.3 Reporter: Benoit Perroud Priority: Minor Every time newRow is called, serializedSize iterates through all the columns to compute the size. Once 1'000'000 columns exist in the CF, it becomes painful to repeat the same computation on every call. Caching the size and incrementing it when a Column is added could be an option. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
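The caching idea suggested in the ticket might look like the following toy sketch; CachedSizeCf and its byte-array columns are illustrative stand-ins for the real ColumnFamily and IColumn types, not the actual Cassandra classes:

```java
import java.util.ArrayList;
import java.util.List;

// Keep a running serialized-size counter updated on each addColumn instead
// of re-walking every column each time newRow asks for the size.
public class CachedSizeCf {
    private final List<byte[]> columns = new ArrayList<>();
    private long cachedSerializedSize = 0;

    public void addColumn(byte[] column) {
        columns.add(column);
        cachedSerializedSize += column.length; // O(1) incremental update
    }

    // O(1) instead of O(number of columns) per call.
    public long serializedSize() {
        return cachedSerializedSize;
    }

    public static void main(String[] args) {
        CachedSizeCf cf = new CachedSizeCf();
        cf.addColumn(new byte[10]);
        cf.addColumn(new byte[22]);
        System.out.println(cf.serializedSize()); // 32
    }
}
```

The trade-off is that any mutation path (column overwrite, removal) must also adjust the counter, or the cached value drifts from the true serialized size.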
[jira] [Updated] (CASSANDRA-3139) Prevent users from creating keyspaces with LocalStrategy replication
[ https://issues.apache.org/jira/browse/CASSANDRA-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-3139: --- Attachment: CASSANDRA-3139.patch Prevent users from creating keyspaces with LocalStrategy replication Key: CASSANDRA-3139 URL: https://issues.apache.org/jira/browse/CASSANDRA-3139 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3139.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3139) Prevent users from creating keyspaces with LocalStrategy replication
[ https://issues.apache.org/jira/browse/CASSANDRA-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097323#comment-13097323 ] Jonathan Ellis commented on CASSANDRA-3139: --- LocalStrategy isn't deprecated; it's just reserved for internal use. Prevent users from creating keyspaces with LocalStrategy replication Key: CASSANDRA-3139 URL: https://issues.apache.org/jira/browse/CASSANDRA-3139 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3139.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3139) Prevent users from creating keyspaces with LocalStrategy replication
[ https://issues.apache.org/jira/browse/CASSANDRA-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-3139: --- Attachment: (was: CASSANDRA-3139.patch) Prevent users from creating keyspaces with LocalStrategy replication Key: CASSANDRA-3139 URL: https://issues.apache.org/jira/browse/CASSANDRA-3139 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.6 -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-3139) Prevent users from creating keyspaces with LocalStrategy replication
[ https://issues.apache.org/jira/browse/CASSANDRA-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pavel Yaskevich updated CASSANDRA-3139: --- Attachment: CASSANDRA-3139.patch error message is fixed. Prevent users from creating keyspaces with LocalStrategy replication Key: CASSANDRA-3139 URL: https://issues.apache.org/jira/browse/CASSANDRA-3139 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3139.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3139) Prevent users from creating keyspaces with LocalStrategy replication
[ https://issues.apache.org/jira/browse/CASSANDRA-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097561#comment-13097561 ] Jonathan Ellis commented on CASSANDRA-3139: --- +1 Prevent users from creating keyspaces with LocalStrategy replication Key: CASSANDRA-3139 URL: https://issues.apache.org/jira/browse/CASSANDRA-3139 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3139.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
svn commit: r1165438 - in /cassandra/branches/cassandra-0.8: CHANGES.txt src/java/org/apache/cassandra/thrift/ThriftValidation.java test/unit/org/apache/cassandra/cli/CliTest.java test/unit/org/apache
Author: xedin Date: Mon Sep 5 22:21:01 2011 New Revision: 1165438 URL: http://svn.apache.org/viewvc?rev=1165438view=rev Log: Prevent users from creating keyspaces with LocalStrategy replication patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-3139 Modified: cassandra/branches/cassandra-0.8/CHANGES.txt cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/cli/CliTest.java cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/thrift/ThriftValidationTest.java Modified: cassandra/branches/cassandra-0.8/CHANGES.txt URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/CHANGES.txt?rev=1165438r1=1165437r2=1165438view=diff == --- cassandra/branches/cassandra-0.8/CHANGES.txt (original) +++ cassandra/branches/cassandra-0.8/CHANGES.txt Mon Sep 5 22:21:01 2011 @@ -1,7 +1,8 @@ 0.8.6 * avoid trying to watch cassandra-topology.properties when loaded from jar (CASSANDRA-3138) - + * prevent users from creating keyspaces with LocalStrategy replication + (CASSANDRA-3139) 0.8.5 * fix NPE when encryption_options is unspecified (CASSANDRA-3007) Modified: cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java?rev=1165438r1=1165437r2=1165438view=diff == --- cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java (original) +++ cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java Mon Sep 5 22:21:01 2011 @@ -27,6 +27,7 @@ import org.slf4j.Logger; import org.slf4j.LoggerFactory; import org.apache.cassandra.config.*; +import org.apache.cassandra.locator.*; import org.apache.cassandra.db.*; import org.apache.cassandra.db.marshal.AbstractType; import org.apache.cassandra.db.marshal.AsciiType; @@ -36,10 +37,6 @@ import 
org.apache.cassandra.db.migration import org.apache.cassandra.dht.IPartitioner; import org.apache.cassandra.dht.RandomPartitioner; import org.apache.cassandra.dht.Token; -import org.apache.cassandra.locator.AbstractReplicationStrategy; -import org.apache.cassandra.locator.IEndpointSnitch; -import org.apache.cassandra.locator.NetworkTopologyStrategy; -import org.apache.cassandra.locator.TokenMetadata; import org.apache.cassandra.service.StorageService; import org.apache.cassandra.utils.ByteBufferUtil; import org.apache.cassandra.utils.FBUtilities; @@ -671,6 +668,10 @@ public class ThriftValidation TokenMetadata tmd = StorageService.instance.getTokenMetadata(); IEndpointSnitch eps = DatabaseDescriptor.getEndpointSnitch(); Class<? extends AbstractReplicationStrategy> cls = AbstractReplicationStrategy.getClass(ks_def.strategy_class); + +if (cls.equals(LocalStrategy.class)) +throw new ConfigurationException("Unable to use given strategy class: LocalStrategy is reserved for internal use."); + AbstractReplicationStrategy.createReplicationStrategy(ks_def.name, cls, tmd, eps, options); } Modified: cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/cli/CliTest.java URL: http://svn.apache.org/viewvc/cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/cli/CliTest.java?rev=1165438r1=1165437r2=1165438view=diff == --- cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/cli/CliTest.java (original) +++ cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/cli/CliTest.java Mon Sep 5 22:21:01 2011 @@ -124,8 +124,7 @@ public class CliTest extends CleanupHelp "drop index on '123'.617070;", "drop index on '123'.'-617071';", "drop index on CF3.'big world';", -"update keyspace TestKeySpace with placement_strategy='org.apache.cassandra.locator.LocalStrategy' and durable_writes = false;", -"update keyspace TestKeySpace with strategy_options=[{DC1:3, DC2:4, DC5:1}];", +"update keyspace TestKeySpace with durable_writes = false;", "assume 123 comparator as utf8;",
"assume 123 sub_comparator as integer;", "assume 123 validator as lexicaluuid;", @@ -166,6 +165,8 @@ public class CliTest extends CleanupHelp "get myCF['key']['scName']", "assume CF3 keys as utf8;", "use TestKEYSpace;", +"update keyspace TestKeySpace with placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy';", +"update keyspace TestKeySpace with strategy_options=[{DC1:3, DC2:4, DC5:1}];", "describe cluster;", "help describe cluster;", "show cluster name", Modified:
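The validation added by this commit can be rendered as a standalone sketch. The stub strategy classes below are placeholders for Cassandra's real locator classes, and IllegalArgumentException stands in for ConfigurationException:

```java
public class StrategyValidation {
    // Stubs standing in for org.apache.cassandra.locator classes.
    public static class LocalStrategy {}
    public static class NetworkTopologyStrategy {}

    // Mirror of the ThriftValidation check: reject keyspace definitions
    // whose replication strategy is LocalStrategy.
    public static void validateStrategy(Class<?> cls) {
        if (cls.equals(LocalStrategy.class))
            throw new IllegalArgumentException(
                "Unable to use given strategy class: LocalStrategy is reserved for internal use.");
    }

    public static void main(String[] args) {
        validateStrategy(NetworkTopologyStrategy.class); // passes silently
        try {
            validateStrategy(LocalStrategy.class);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

As the follow-up comment notes, LocalStrategy is not deprecated; it is simply reserved for internal use (system keyspaces), so the check belongs at the user-facing API boundary.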
[Cassandra Wiki] Update of Operations by vijay2win
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The Operations page has been changed by vijay2win: http://wiki.apache.org/cassandra/Operations?action=diffrev1=95rev2=96 The status of move and balancing operations can be monitored using `nodetool` with the `netstat` argument. (Cassandra 0.6.* and lower use the `streams` argument). + === Replacing a Dead Node (with same token): === + + Since Cassandra 1.0 we can replace an existing node with a new node using the property cassandra.replace_token=Token. This property can be set with the -D option when starting the Cassandra daemon process. + + (Note: this property takes effect only when the node doesn't have any data in it. You might want to empty the data dir if you want to force the node replacement.) + + You must use this property only when replacing a dead node (if you try to replace an existing live node, the bootstrapping node will throw an Exception). + The token used via this property must already be part of the ring, and the node that owned it must have died. + + Once this property is set, the node starts in a hibernate state, during which all the other nodes will see this node as down. The new node will then start to bootstrap the data from the rest of the nodes in the cluster (the main difference from normal bootstrapping of a new node is that this new node will not accept any writes during this phase). Once the bootstrapping is complete the node will be marked UP; we rely on hinted handoff to make this node consistent (since it has not accepted writes since the start of the bootstrap). + + Note: we strongly suggest repairing the node once the bootstrap is completed, because hinted handoff is a best effort and not a guarantee. + == Consistency == Cassandra allows clients to specify the desired consistency level on reads and writes. (See [[API]].) 
If R + W > N, where R, W, and N are respectively the read replica count, the write replica count, and the replication factor, all client reads will see the most recent write. Otherwise, readers '''may''' see older versions, for periods of typically a few ms; this is called eventual consistency. See http://www.allthingsdistributed.com/2008/12/eventually_consistent.html and http://queue.acm.org/detail.cfm?id=1466448 for more. @@ -225, +238 @@ NOTE: Starting with version 0.7, json2sstable and sstable2json must be run in such a way that the schema can be loaded from system tables. This means that cassandra.yaml must be found in the classpath and refer to valid storage directories. == Monitoring == - Running `nodetool cfstats` can provide an overview of each Column Family, and important metrics to graph your cluster. Cassandra also exposes internal metrics as JMX data. This is a common standard in the JVM world; OpenNMS, Nagios, and Munin at least offer some level of JMX support. For a non-stupid JMX plugin for Munin check out https://github.com/tcurdt/jmx2munin + Running `nodetool cfstats` can provide an overview of each Column Family, and important metrics to graph your cluster. Cassandra also exposes internal metrics as JMX data. This is a common standard in the JVM world; OpenNMS, Nagios, and Munin at least offer some level of JMX support. For a non-stupid JMX plugin for Munin check out https://github.com/tcurdt/jmx2munin - The specifics of the JMX Interface are documented at JmxInterface. + The specifics of the JMX Interface are documented at JmxInterface. For folks who prefer not to deal with JMX clients, there is a JMX-to-REST bridge available at http://code.google.com/p/polarrose-jmx-rest-bridge/ Bridging to SNMP is a bit more work but can be done with https://github.com/tcurdt/jmx2snmp
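The R + W > N rule quoted above is easy to check numerically: with RF N = 3, QUORUM reads and writes both use floor(N/2) + 1 = 2 replicas, and 2 + 2 > 3, so read and write sets must overlap. A toy sketch (class name made up) computes the quorum size and the overlap condition:

```java
public class QuorumOverlap {
    // QUORUM replica count for a given replication factor: a strict majority.
    public static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // The rule from the wiki: reads see the latest write iff R + W > N,
    // because then every read set intersects every write set.
    public static boolean stronglyConsistent(int r, int w, int n) {
        return r + w > n;
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);                               // 2
        System.out.println(q);
        System.out.println(stronglyConsistent(q, q, n)); // true: QUORUM/QUORUM
        System.out.println(stronglyConsistent(1, 1, n)); // false: ONE/ONE is eventual
    }
}
```

The same arithmetic explains the eventual-consistency caveat in the text: at CL.ONE both ways, 1 + 1 is not greater than 3, so a read may land on a replica the write never reached.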
[jira] [Commented] (CASSANDRA-3139) Prevent users from creating keyspaces with LocalStrategy replication
[ https://issues.apache.org/jira/browse/CASSANDRA-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097626#comment-13097626 ] Hudson commented on CASSANDRA-3139: --- Integrated in Cassandra-0.8 #315 (See [https://builds.apache.org/job/Cassandra-0.8/315/]) Prevent users from creating keyspaces with LocalStrategy replication patch by Pavel Yaskevich; reviewed by Jonathan Ellis for CASSANDRA-3139 xedin : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1165438 Files : * /cassandra/branches/cassandra-0.8/CHANGES.txt * /cassandra/branches/cassandra-0.8/src/java/org/apache/cassandra/thrift/ThriftValidation.java * /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/cli/CliTest.java * /cassandra/branches/cassandra-0.8/test/unit/org/apache/cassandra/thrift/ThriftValidationTest.java Prevent users from creating keyspaces with LocalStrategy replication Key: CASSANDRA-3139 URL: https://issues.apache.org/jira/browse/CASSANDRA-3139 Project: Cassandra Issue Type: Bug Components: API Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 0.8.6 Attachments: CASSANDRA-3139.patch -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097625#comment-13097625 ] Zhu Han commented on CASSANDRA-2434: bq. Also if only one node is down you should still be able to read/write at quorum and achieve consistency I suppose quorum read plus quorum write should provide monotonic read consistency. [1] Suppose a quorum write on key1 hits node A and node B, but not node C, due to a temporary network partition. After that, node B is replaced by node D because it is down, and node D streams data from node C. If the following quorum read on key1 hits only node C and node D, monotonic consistency is violated. This is rare but not unrealistic, especially when hinted handoff is disabled. Maybe it is more reasonable to give the admin an option to specify that the bootstrapped node should not accept any read requests until the admin turns it on manually. The admin can then start a manual repair if he wants to make sure everything is fine. [1] http://www.allthingsdistributed.com/2007/12/eventually_consistent.html node bootstrapping can violate consistency -- Key: CASSANDRA-2434 URL: https://issues.apache.org/jira/browse/CASSANDRA-2434 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: paul cannon Fix For: 1.1 Attachments: 2434.patch.txt My reading (a while ago) of the code indicates that there is no logic involved during bootstrapping that avoids consistency level violations. If I recall correctly it just grabs neighbors that are currently up. There are at least two issues I have with this behavior: * If I have a cluster where I have applications relying on QUORUM with RF=3, and bootstrapping completes based on only one node, I have just violated the supposedly guaranteed consistency semantics of the cluster. 
* Nodes can flap up and down at any time, so even if a human takes care to look at which nodes are up and thinks about it carefully before bootstrapping, there's no guarantee. A complication is that it depends on the use-case whether this is an issue (if all you ever do is at CL.ONE, it's fine); even in a cluster which is otherwise used for QUORUM operations you may wish to accept less-than-quorum nodes during bootstrap in various emergency situations. A potential easy fix is to have bootstrap take an argument which is the number of hosts to bootstrap from, or to assume QUORUM if none is given. (A related concern is bootstrapping across data centers. You may *want* to bootstrap to a local node and then do a repair to avoid sending loads of data across DCs while still achieving consistency. Or even if you don't care about the consistency issues, I don't think there is currently a way to bootstrap from local nodes only.) Thoughts? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
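The "assume QUORUM if none is given" default proposed above is just a simple majority of the replication factor; a small sketch of that arithmetic (names are hypothetical, not the Cassandra bootstrap code):

```java
// Quorum size for a given replication factor: a strict majority of replicas.
// With RF=3, a bootstrap honoring QUORUM semantics would need to stream from
// at least 2 replicas rather than whichever single neighbor happens to be up.
public class BootstrapQuorum {
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    public static void main(String[] args) {
        System.out.println(BootstrapQuorum.quorum(3)); // 2
        System.out.println(BootstrapQuorum.quorum(5)); // 3
        System.out.println(BootstrapQuorum.quorum(1)); // 1
    }
}
```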
[jira] [Commented] (CASSANDRA-3140) Expose server, api versions to CQL
[ https://issues.apache.org/jira/browse/CASSANDRA-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097660#comment-13097660 ] Rick Shaw commented on CASSANDRA-3140: -- +1 for this approach. A generalized interface for methods that can occur as a pseudo-column would be worth discussing. Expose server, api versions to CQL -- Key: CASSANDRA-3140 URL: https://issues.apache.org/jira/browse/CASSANDRA-3140 Project: Cassandra Issue Type: New Feature Reporter: Jonathan Ellis Priority: Minor Fix For: 1.0 Need to expose the CQL api version; might as well include the server version while we're at it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-3118) nodetool can not decommission a node
[ https://issues.apache.org/jira/browse/CASSANDRA-3118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097665#comment-13097665 ] deng commented on CASSANDRA-3118: - But there is another problem. After nodeA was decommissioned, I changed the seeds from 100.86.12.224 to 127.0.0.1 in cassandra.yaml; the listen_address and rpc_address were still 100.86.17.9. I restarted the nodeA server, but nodeA automatically joined the cluster, even after I installed a fresh Cassandra 0.8.4. Why? nodeA had already been decommissioned from the cluster. Then I changed the listen_address and rpc_address from 100.86.17.9 to localhost and restarted the nodeA server; this time nodeA could not automatically join the cluster. Why? nodetool can not decommission a node -- Key: CASSANDRA-3118 URL: https://issues.apache.org/jira/browse/CASSANDRA-3118 Project: Cassandra Issue Type: Bug Components: Tools Affects Versions: 0.8.4 Environment: Cassandra 0.8.4 Reporter: deng Attachments: 3118-debug.txt When I run nodetool ring I get the result below, and then I want to decommission the 100.86.17.90 node, but I get this error: [root@ip bin]# ./nodetool -h10.86.12.225 ring Address DC RackStatus State LoadOwns Token 154562542458917734942660802527609328132 100.86.17.90 datacenter1 rack1 Up Leaving 1.08 MB 11.21% 3493450320433654773610109291263389161 100.86.12.225datacenter1 rack1 Up Normal 558.25 MB 14.25% 27742979166206700793970535921354744095 100.86.12.224datacenter1 rack1 Up Normal 5.01 GB 6.58% 38945137636148605752956920077679425910 ERROR: root@ip bin]# ./nodetool -h100.86.17.90 decommission Exception in thread main java.lang.UnsupportedOperationException at java.util.AbstractList.remove(AbstractList.java:144) at java.util.AbstractList$Itr.remove(AbstractList.java:360) at java.util.AbstractCollection.removeAll(AbstractCollection.java:337) at org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1041) at 
org.apache.cassandra.service.StorageService.calculatePendingRanges(StorageService.java:1006) at org.apache.cassandra.service.StorageService.handleStateLeaving(StorageService.java:877) at org.apache.cassandra.service.StorageService.onChange(StorageService.java:732) at org.apache.cassandra.gms.Gossiper.doNotifications(Gossiper.java:839) at org.apache.cassandra.gms.Gossiper.addLocalApplicationState(Gossiper.java:986) at org.apache.cassandra.service.StorageService.startLeaving(StorageService.java:1836) at org.apache.cassandra.service.StorageService.decommission(StorageService.java:1855) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1426) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1264) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1359) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at
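The top of the stack trace above (AbstractList.remove → AbstractList$Itr.remove → AbstractCollection.removeAll) is the JDK's signature for calling removeAll on a fixed-size List, whose iterator does not support remove(). A standalone reproduction of that failure mode, with the usual fix of copying into a mutable list first (this is plain JDK behavior, not Cassandra code):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Arrays.asList returns a fixed-size AbstractList; removeAll (inherited from
// AbstractCollection) iterates and calls Iterator.remove, which throws
// UnsupportedOperationException — the same chain as the decommission trace.
public class FixedSizeListDemo {
    static boolean removeAllThrows(List<Integer> list) {
        try {
            list.removeAll(Collections.singleton(2));
            return false;
        } catch (UnsupportedOperationException e) {
            return true; // fixed-size list: element 2 found, removal attempted, throws
        }
    }

    public static void main(String[] args) {
        List<Integer> fixed = Arrays.asList(1, 2, 3);                    // fixed-size view
        List<Integer> mutable = new ArrayList<>(Arrays.asList(1, 2, 3)); // defensive copy
        System.out.println(removeAllThrows(fixed));   // true
        System.out.println(removeAllThrows(mutable)); // false: the copy supports removal
    }
}
```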
[Cassandra Wiki] Update of Operations by JonathanEllis
Dear Wiki user, You have subscribed to a wiki page or wiki category on Cassandra Wiki for change notification. The Operations page has been changed by JonathanEllis: http://wiki.apache.org/cassandra/Operations?action=diff&rev1=96&rev2=97 Using a strong hash function means !RandomPartitioner keys will, on average, be evenly spread across the Token space, but you can still have imbalances if your Tokens do not divide up the range evenly, so you should specify !InitialToken to your first nodes as `i * (2**127 / N)` for i = 0 .. N-1. In Cassandra 0.7, you should specify `initial_token` in `cassandra.yaml`. With !NetworkTopologyStrategy, you should calculate the tokens for the nodes in each DC independently. Tokens still need to be unique, so you can add 1 to the tokens in the 2nd DC, add 2 in the 3rd, and so on. Thus, for a 4-node cluster in 2 datacenters, you would have + {{{ DC1 node 1 = 0 @@ -33, +34 @@ node 3 = 1 node 4 = 85070591730234615865843651857942052865 }}} - - If you happen to have the same number of nodes in each data center, you can also alternate data centers when assigning tokens: + {{{ [DC1] node 1 = 0 [DC2] node 2 = 42535295865117307932921825928971026432 [DC1] node 3 = 85070591730234615865843651857942052864 [DC2] node 4 = 127605887595351923798765477786913079296 }}} - With order preserving partitioners, your key distribution will be application-dependent. You should still take your best guess at specifying initial tokens (guided by sampling actual data, if possible), but you will be more dependent on active load balancing (see below) and/or adding new nodes to hot spots. Once data is placed on the cluster, the partitioner may not be changed without wiping and starting over. @@ -127, +126 @@ The status of move and balancing operations can be monitored using `nodetool` with the `netstats` argument. (Cassandra 0.6.* and lower use the `streams` argument). 
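The `i * (2**127 / N)` rule plus the per-DC offset described above can be computed mechanically; a small sketch (class and method names are illustrative):

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Token assignment per the formula on the page: divide the RandomPartitioner
// token space (2**127) evenly among the N nodes of a datacenter, then add a
// small per-DC offset (0 for DC1, 1 for DC2, ...) to keep tokens unique.
public class InitialTokens {
    static final BigInteger RING_SIZE = BigInteger.valueOf(2).pow(127);

    static List<BigInteger> tokens(int nodesInDc, int dcOffset) {
        BigInteger step = RING_SIZE.divide(BigInteger.valueOf(nodesInDc));
        List<BigInteger> result = new ArrayList<>();
        for (int i = 0; i < nodesInDc; i++) {
            result.add(step.multiply(BigInteger.valueOf(i)).add(BigInteger.valueOf(dcOffset)));
        }
        return result;
    }

    public static void main(String[] args) {
        // DC1 (offset 0): [0, 85070591730234615865843651857942052864]
        System.out.println(tokens(2, 0));
        // DC2 (offset 1): [1, 85070591730234615865843651857942052865]
        System.out.println(tokens(2, 1));
    }
}
```

With two nodes per DC this reproduces the token values shown in the wiki example above.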
- === Replacing a Dead Node (with same token): === + === Replacing a Dead Node === - - Since Cassandra 1.0 we can replace an existing node with a new node using the property cassandra.replace_token=Token, This property can be set using -D option while starting cassandra demon process. + Since Cassandra 1.0 we can replace a dead node with a new one using the property cassandra.replace_token=<Token>. This property can be set with the -D option when starting the Cassandra daemon process. (Note: this property takes effect only when the node doesn't have any data in it; you might want to empty the data directory if you want to force the node replacement.) + You must use this property only when replacing a dead node (if you try to replace an existing live node, the bootstrapping node will throw an exception). The token passed via this property must already be part of the ring, and its node must have died. - You must use this property when replacing a dead node (If tried to replace an existing live node, the bootstrapping node will throw a Exception). - The token used via this property must be part of the ring and the node have died due to various reasons. Once this property is enabled the node starts in a hibernate state, during which all the other nodes will see this node as down. The new node will then start to bootstrap the data from the rest of the nodes in the cluster (the main difference from normal bootstrapping of a new node is that this node will not accept any writes during this phase). Once the bootstrapping is complete the node will be marked UP; we rely on hinted handoffs to make this node consistent (since we don't accept writes from the start of the bootstrap). @@ -238, +235 @@ NOTE: Starting with version 0.7, json2sstable and sstable2json must be run in such a way that the schema can be loaded from system tables. This means that cassandra.yaml must be found in the classpath and refer to valid storage directories. 
== Monitoring == - Running `nodetool cfstats` can provide an overview of each Column Family, and important metrics to graph your cluster. Cassandra also exposes internal metrics as JMX data. This is a common standard in the JVM world; OpenNMS, Nagios, and Munin at least offer some level of JMX support. For a non-stupid JMX plugin for Munin check out https://github.com/tcurdt/jmx2munin + Running `nodetool cfstats` can provide an overview of each Column Family, and important metrics to graph your cluster. Cassandra also exposes internal metrics as JMX data. This is a common standard in the JVM world; OpenNMS, Nagios, and Munin at least offer some level of JMX support. For a non-stupid JMX plugin for Munin check out https://github.com/tcurdt/jmx2munin The specifics of the JMX Interface are documented at JmxInterface. - The specifics of the JMX Interface are documented at JmxInterface. Some folks prefer not having to deal with JMX clients; for them there is a JMX-to-REST bridge available at
[jira] [Commented] (CASSANDRA-3050) Global row cache
[ https://issues.apache.org/jira/browse/CASSANDRA-3050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097685#comment-13097685 ] Jonathan Ellis commented on CASSANDRA-3050: --- Hmm. I guess we could have the cache provider include a sizeInMemory method? For serialized off-heap cache we can just use the FreeableMemory size(). For on-heap cache we can use the serializedSize * liveRatio from the CF's memtable. Global row cache Key: CASSANDRA-3050 URL: https://issues.apache.org/jira/browse/CASSANDRA-3050 Project: Cassandra Issue Type: Improvement Components: Core Reporter: Jonathan Ellis Assignee: Pavel Yaskevich Priority: Minor Fix For: 1.1 Row-cache-per-columnfamily is difficult to configure well as columnfamilies are added, similar to how memtables were difficult pre-CASSANDRA-2006. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097695#comment-13097695 ] paul cannon commented on CASSANDRA-2434: bq. The suggestion was that if the 'correct' node is down, you can force the bootstrap to complete anyway (probably from the closest node, but that is transparent to the user), but only if the 'correct' node is down. Oh, ok. I misunderstood. This seems reasonable. I'd lean toward the more general solution, yeah, but I don't feel very strongly about it. node bootstrapping can violate consistency -- Key: CASSANDRA-2434 URL: https://issues.apache.org/jira/browse/CASSANDRA-2434 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: paul cannon Fix For: 1.1 Attachments: 2434.patch.txt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (CASSANDRA-2434) node bootstrapping can violate consistency
[ https://issues.apache.org/jira/browse/CASSANDRA-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13097703#comment-13097703 ] Zhu Han commented on CASSANDRA-2434: As Peter suggested before, another approach to fix the consistency problem is streaming sstables from all alive peers if the correct node is down, and then leaving them to normal compaction. This would be much more lightweight than anti-entropy repair, except for the network IO pressure on the bootstrapping node. node bootstrapping can violate consistency -- Key: CASSANDRA-2434 URL: https://issues.apache.org/jira/browse/CASSANDRA-2434 Project: Cassandra Issue Type: Bug Reporter: Peter Schuller Assignee: paul cannon Fix For: 1.1 Attachments: 2434.patch.txt -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (CASSANDRA-3142) CustomTThreadPoolServer should log TTransportException at DEBUG level
CustomTThreadPoolServer should log TTransportException at DEBUG level - Key: CASSANDRA-3142 URL: https://issues.apache.org/jira/browse/CASSANDRA-3142 Project: Cassandra Issue Type: Bug Reporter: Jim Ancona Currently CustomTThreadPoolServer, like the Thrift TThreadPoolServer, silently ignores TTransportException in its run() method. This is appropriate in most cases because TTransportException occurs fairly often when client connections die. However TTransportException is also thrown when TFramedTransport encounters a frame that is larger than thrift_framed_transport_size_in_mb. In that case, silently exiting the run loop leads to a SocketException on the client side which can be both difficult to diagnose, in part because nothing is logged by Cassandra, and high-impact, because the client may respond by marking the server node down and retrying the too-large request on another node, where it also fails. Repeated, this process leads to the entire cluster being marked down (see https://github.com/rantav/hector/issues/212). I've filed two Thrift issues (https://issues.apache.org/jira/browse/THRIFT-1323 and https://issues.apache.org/jira/browse/THRIFT-1324), but in the meantime, I suggest that CustomTThreadPoolServer log the exception at DEBUG level in order to support easier troubleshooting. I can submit a patch with the added log message. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
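The change proposed above amounts to replacing a silent catch with a DEBUG-level log call. A hypothetical sketch of that pattern: the exception class below is a stand-in for org.apache.thrift.transport.TTransportException, and the class, method, and logger names are illustrative, not the actual Cassandra source.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Sketch of the proposed fix: instead of swallowing the transport exception in
// the worker's run() loop, record it at DEBUG (FINE in java.util.logging) so
// that oversized-frame failures leave a trace on the server side.
public class WorkerSketch {
    private static final Logger logger = Logger.getLogger("CustomTThreadPoolServer");

    // Stand-in for org.apache.thrift.transport.TTransportException.
    static class TTransportException extends Exception {
        TTransportException(String msg) { super(msg); }
    }

    static String handle(TTransportException e) {
        String msg = "Thrift transport error during processing: " + e.getMessage();
        logger.log(Level.FINE, msg, e); // previously: silently ignored
        return msg;
    }

    public static void main(String[] args) {
        System.out.println(handle(new TTransportException(
                "frame larger than thrift_framed_transport_size_in_mb")));
    }
}
```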
[jira] [Commented] (CASSANDRA-957) convenience workflow for replacing dead node
[ https://issues.apache.org/jira/browse/CASSANDRA-957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13097708#comment-13097708 ] Hudson commented on CASSANDRA-957: -- Integrated in Cassandra #1076 (See [https://builds.apache.org/job/Cassandra/1076/]) convenience workflow for replacing dead node patch by Vijay; reviewed by Nick Bailey for CASSANDRA-957 jbellis : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVNview=revrev=1165468 Files : * /cassandra/trunk/NEWS.txt * /cassandra/trunk/src/java/org/apache/cassandra/config/DatabaseDescriptor.java * /cassandra/trunk/src/java/org/apache/cassandra/db/HintedHandOffManager.java * /cassandra/trunk/src/java/org/apache/cassandra/db/RowMutation.java * /cassandra/trunk/src/java/org/apache/cassandra/dht/BootStrapper.java * /cassandra/trunk/src/java/org/apache/cassandra/gms/EndpointState.java * /cassandra/trunk/src/java/org/apache/cassandra/gms/Gossiper.java * /cassandra/trunk/src/java/org/apache/cassandra/gms/VersionedValue.java * /cassandra/trunk/src/java/org/apache/cassandra/service/LoadBroadcaster.java * /cassandra/trunk/src/java/org/apache/cassandra/service/MigrationManager.java * /cassandra/trunk/src/java/org/apache/cassandra/service/StorageProxy.java * /cassandra/trunk/src/java/org/apache/cassandra/service/StorageService.java convenience workflow for replacing dead node Key: CASSANDRA-957 URL: https://issues.apache.org/jira/browse/CASSANDRA-957 Project: Cassandra Issue Type: Wish Components: Core, Tools Affects Versions: 0.8.2 Reporter: Jonathan Ellis Assignee: Vijay Fix For: 1.0 Attachments: 0001-Support-bringing-back-a-node-to-the-cluster-that-exi.patch, 0001-adding-NEWS.patch, 0001-support-for-replace-token-v3.patch, 0001-support-token-replace-v4.patch, 0001-support-token-replace-v5.patch, 0001-support-token-replace-v6.patch, 0001-support-token-replace-v7.patch, 0002-Do-not-include-local-node-when-computing-workMap.patch, 0002-hints-on-token-than-ip-v4.patch, 
0002-hints-on-token-than-ip-v5.patch, 0002-hints-on-token-than-ip-v6.patch, 0002-upport-for-hints-on-token-v3.patch Original Estimate: 24h Remaining Estimate: 24h Replacing a dead node with a new one is a common operation, but nodetool removetoken followed by bootstrap is inefficient (re-replicating data first to the remaining nodes, then to the new one) and manually bootstrapping to a token just less than the old one's, followed by nodetool removetoken is slightly painful and prone to manual errors. First question: how would you expose this in our tool ecosystem? It needs to be a startup-time option to the new node, so it can't be nodetool, and messing with the config xml definitely takes the convenience out. A one-off -DreplaceToken=XXY argument? -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (CASSANDRA-2936) improve dependency situation between JDBC driver and Cassandra
[ https://issues.apache.org/jira/browse/CASSANDRA-2936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jonathan Ellis updated CASSANDRA-2936: -- Attachment: 2936-cleanup.txt Ugh, I wish we hadn't touched the AbstractType->AbstractTerm refactor. It doesn't improve things from the dependency standpoint (the latter still depends on the former) and we should be avoiding 11th hour refactors like this where possible (e.g. this screwed CASSANDRA-2734 all to hell). Having come this far, though, I propose the attached patch: - removes ATerm.isCommutative, which is unused and likely to remain so (commutativity is an internal property of counters) - removes AType.toString, which is unused outside of client code, which leaves us with a single-direction dependency instead of bidirectional I further propose renaming AbstractTerm to AbstractJdbcType, and LongTerm, IntegerTerm, etc., to JdbcLong, JdbcInteger, etc., both on semantic grounds (a term implies a concrete use in a parse tree or statement, not a generic type) and pedantic (it's unfortunate that the CamelCase abbreviations of *Type and *Term are identical). improve dependency situation between JDBC driver and Cassandra -- Key: CASSANDRA-2936 URL: https://issues.apache.org/jira/browse/CASSANDRA-2936 Project: Cassandra Issue Type: Improvement Components: API, Core Affects Versions: 0.8.1 Reporter: Eric Evans Assignee: Eric Evans Priority: Minor Labels: cql Fix For: 1.0 Attachments: 2936-cleanup.txt, v1-0001-CASSANDRA-2936-rename-cookie-jar-clientutil.txt, v3-0001-CASSANDRA-2936-create-package-for-CQL-term-marshaling.txt, v3-0002-convert-drivers-and-tests-to-o.a.c.cql.term.txt, v3-0003-remove-extraneous-methods-from-o.a.c.db.marshal-classe.txt, v3-0004-make-better-reuse-of-new-classes.txt, v3-0005-create-jar-file.txt The JDBC jar currently depends on the {{apache-cassandra-$version}} jar, despite the fact that it only (directly) uses a handful of Cassandra's classes. 
In a perfect world, we'd break those classes out into their own jar which both the JDBC driver and Cassandra (ala {{apache-cassandra-$version.jar}}) could depend on. However, the classes used directly don't fall out to anything that makes much sense organizationally (short of creating a {{apache-cassandra-misc-$version.jar}}), and the situation only gets worse when you take into account all of the transitive dependencies. See CASSANDRA-2761 for more background, in particular ([this|https://issues.apache.org/jira/browse/CASSANDRA-2761?focusedCommentId=13048734page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13048734] and [this|https://issues.apache.org/jira/browse/CASSANDRA-2761?focusedCommentId=13050884page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13050884]) -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira