Re: Unbalanced ring in Cassandra 0.8.4
No. Cleanup will scan each sstable to remove data that is no longer owned by that specific node. It won't compact the sstables together, however. On Tue, Jun 19, 2012 at 11:11 PM, Raj N wrote: > But won't that also run a major compaction, which is not recommended anymore? > > -Raj > > > On Sun, Jun 17, 2012 at 11:58 PM, aaron morton > wrote: >> >> Assuming you have been running repair, it can't hurt. >> >> Cheers >> >> - >> Aaron Morton >> Freelance Developer >> @aaronmorton >> http://www.thelastpickle.com >> >> On 17/06/2012, at 4:06 AM, Raj N wrote: >> >> Nick, do you think I should still run cleanup on the first node. >> >> -Rajesh >> >> On Fri, Jun 15, 2012 at 3:47 PM, Raj N wrote: >>> >>> I did run nodetool move. But that was when I was setting up the cluster >>> which means I didn't have any data at that time. >>> >>> -Raj >>> >>> >>> On Fri, Jun 15, 2012 at 1:29 PM, Nick Bailey wrote: Did you start all your nodes at the correct tokens or did you balance by moving them? Moving nodes around won't delete unneeded data after the move is done. Try running 'nodetool cleanup' on all of your nodes. On Fri, Jun 15, 2012 at 12:24 PM, Raj N wrote: > Actually I am not worried about the percentage. It's the data I am > concerned > about. Look at the first node. It has 102.07 GB of data. And the other > nodes > have around 60 GB (one has 69, but let's ignore that one). I am not > understanding why the first node has almost double the data. > > Thanks > -Raj > > > On Fri, Jun 15, 2012 at 11:06 AM, Nick Bailey > wrote: >> >> This is just a known problem with the nodetool output and multiple >> DCs. Your configuration is correct. The problem with nodetool is >> fixed >> in 1.1.1 >> >> https://issues.apache.org/jira/browse/CASSANDRA-3412 >> >> On Fri, Jun 15, 2012 at 9:59 AM, Raj N >> wrote: >> > Hi experts, >> > I have a 6 node cluster across 2 DCs (DC1:3, DC2:3). 
I have >> > assigned >> > tokens using the first strategy(adding 1) mentioned here - >> > >> > http://wiki.apache.org/cassandra/Operations?#Token_selection >> > >> > But when I run nodetool ring on my cluster, this is the result I >> > get - >> > >> > Address DC Rack Status State Load Owns Token >> > >> > 113427455640312814857969558651062452225 >> > 172.17.72.91 DC1 RAC13 Up Normal 102.07 GB 33.33% 0 >> > 45.10.80.144 DC2 RAC5 Up Normal 59.1 GB 0.00% 1 >> > 172.17.72.93 DC1 RAC18 Up Normal 59.57 GB 33.33% >> > 56713727820156407428984779325531226112 >> > 45.10.80.146 DC2 RAC7 Up Normal 59.64 GB 0.00% >> > 56713727820156407428984779325531226113 >> > 172.17.72.95 DC1 RAC19 Up Normal 69.58 GB 33.33% >> > 113427455640312814857969558651062452224 >> > 45.10.80.148 DC2 RAC9 Up Normal 59.31 GB 0.00% >> > 113427455640312814857969558651062452225 >> > >> > >> > As you can see the first node has considerably more load than the >> > others(almost double) which is surprising since all these are >> > replicas >> > of >> > each other. I am running Cassandra 0.8.4. Is there an explanation >> > for >> > this >> > behaviour? >> > Could https://issues.apache.org/jira/browse/CASSANDRA-2433 be >> > the >> > cause for this? >> > >> > Thanks >> > -Raj > > >>> >>> >> >> >
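For reference, the "adding 1" token scheme from that wiki page (evenly spaced tokens in one DC, the second DC's tokens offset by 1) can be sketched like this. A minimal sketch, assuming python3 is available for the 127-bit arithmetic; the exact values may differ by a few units from the tokens in the ring output above depending on rounding.

```shell
# Generate tokens for a 3-nodes-per-DC cluster with RandomPartitioner
# (token space 0 .. 2**127). DC2 tokens are DC1 tokens offset by 1.
python3 -c '
nodes_per_dc = 3
for i in range(nodes_per_dc):
    t = i * 2**127 // nodes_per_dc
    print("DC1 node %d token: %d" % (i, t))
    print("DC2 node %d token: %d" % (i, t + 1))
'
```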
Re: GCInspector works every 10 seconds!
On Mon, Jun 18, 2012 at 12:07 AM, Jason Tang wrote: > After I enable key cache and row cache, the problem is gone. I guess it is because > we have lots of data in SSTables, and it takes more time, memory and CPU to > search the data. The Key Cache is usually a win if added like this. The Row Cache is less likely to be. If I were you I would check your row cache hit rates to make sure you are actually getting a win. :) =Rob -- =Robert Coli AIM>ALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
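A hedged sketch of that hit-rate check, assuming nodetool is on the PATH and the 0.8-era cfstats field names ("Key cache hit rate" / "Row cache hit rate"):

```
nodetool -h 127.0.0.1 cfstats | egrep 'Column Family:|cache hit rate'
```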
Re: Snapshot failing on JSON files in 1.1.0
On Tue, Jun 19, 2012 at 8:55 PM, Rob Coli wrote: > On Tue, Jun 19, 2012 at 2:55 AM, Alain RODRIGUEZ wrote: >> Unable to create hard link from >> /raid0/cassandra/data/cassa_teads/stats_product-hc-233-Data.db to >> /raid0/cassandra/data/cassa_teads/snapshots/1340099026781/stats_product-hc-233-Data.db > > Are you able to create this hard link via the filesystem? I am conjecturing > not. FWIW, the errno given by the OS and passed through Java is "1": http://freespace.sourceforge.net/errno/linux.html "1 EPERM: Operation not permitted" =Rob -- =Robert Coli AIM>ALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
Re: Snapshot failing on JSON files in 1.1.0
On Tue, Jun 19, 2012 at 2:55 AM, Alain RODRIGUEZ wrote: > Unable to create hard link from > /raid0/cassandra/data/cassa_teads/stats_product-hc-233-Data.db to > /raid0/cassandra/data/cassa_teads/snapshots/1340099026781/stats_product-hc-233-Data.db Are you able to create this hard link via the filesystem? I am conjecturing not. Is "snapshots" perhaps on a different mountpoint than the directory you are trying to snapshot via hardlinks? =Rob PS - boy, 9 emails in the thread... full of log output, sure don't miss them not being bottom-quoted to every email... :) -- =Robert Coli AIM>ALK - rc...@palominodb.com YAHOO - rcoli.palominob SKYPE - rcoli_palominodb
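Rob's question can be tested directly from a shell. A minimal sketch (the mktemp path is a stand-in; to reproduce the actual failure, point src at a real sstable in the data directory and dst at the snapshots directory):

```shell
# Attempt the same kind of hard link the snapshot code makes; ln's error
# message (e.g. "Operation not permitted" or "Invalid cross-device link")
# is the clue to what the filesystem is refusing.
src=$(mktemp /tmp/hardlink-test.XXXXXX)
dst="${src}.link"
if ln "$src" "$dst"; then
    echo "hard link ok"
else
    echo "hard link failed"
fi
rm -f "$src" "$dst"
```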
Re: cassandra secondary index with
Hi Jonathan, thanks for the reference. Will read up on it. Yuhan
Re: cassandra secondary index with
Because this will get you *worse* performance than just doing a seq scan would. Details as to why are here: http://www.datastax.com/dev/blog/whats-new-cassandra-07-secondary-indexes On Tue, Jun 19, 2012 at 2:48 PM, Yuhan Zhang wrote: > To answer my own question: > > There should be at least one "equal" expression in the indexed query to > combine with a "gte". > So, I just added a trivial column that stays constant for equal comparison, > and it works. > > Not sure why this requirement exists. > > Thank you. > > Yuhan > > > On Tue, Jun 19, 2012 at 12:23 PM, Yuhan Zhang wrote: >> >> Hi all, >> >> I'm trying to search by the secondary index of cassandra with "greater >> than or equal", but reached an exception stating: >> me.prettyprint.hector.api.exceptions.HInvalidRequestException: >> InvalidRequestException(why:No indexed columns present in index clause with >> operator EQ) >> >> However, the same column family with the same column works when the search >> expression is an "equal". I'm using the Hector java client. >> The secondary index type has been set to: {column_name: sport, >> validation_class: DoubleType, index_type: KEYS } >> >> Here's the code reaching the exception: >> >> public QueryResult> >> getIndexedSlicesGTE(String columnFamily, String columnName, double value, >> String... 
columns) { >> Keyspace keyspace = getKeyspace(); >> StringSerializer se = CassandraStorage.getStringExtractor(); >> >> IndexedSlicesQuery indexedSlicesQuery = >> createIndexedSlicesQuery(keyspace, se, se, DoubleSerializer.get()); >> indexedSlicesQuery.setColumnFamily(columnFamily); >> indexedSlicesQuery.setStartKey(""); >> if(columns != null) >> indexedSlicesQuery.setColumnNames(columns); >> else { >> indexedSlicesQuery.setRange("", "", true, MAX_RECORD_NUMBER); >> } >> >> indexedSlicesQuery.setRowCount(CassandraStorage.MAX_RECORD_NUMBER); >> indexedSlicesQuery.addGteExpression(columnName, value); >> // this doesn't work :( >> //indexedSlicesQuery.addEqualsExpression(columnName, value); // >> this works! >> QueryResult> result = >> indexedSlicesQuery.execute(); >> >> return result; >> } >> >> >> Is there any column_meta setting that is required in order to make GTE >> comparison works on secondary index? >> >> Thank you. >> >> Yuhan Zhang >> >> >> > > > > -- > Yuhan Zhang > Application Developer > OneScreen Inc. > yzh...@onescreen.com > www.onescreen.com > > The information contained in this e-mail is for the exclusive use of the > intended recipient(s) and may be confidential, proprietary, and/or legally > privileged. Inadvertent disclosure of this message does not constitute a > waiver of any privilege. If you receive this message in error, please do > not directly or indirectly print, copy, retransmit, disseminate, or > otherwise use the information. In addition, please delete this e-mail and > all copies and notify the sender. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Row caching in Cassandra 1.1 by column family
rows_cached is actually obsolete in 1.1. New hotness explained here: http://www.datastax.com/dev/blog/caching-in-cassandra-1-1 On Mon, Jun 18, 2012 at 7:43 PM, Chris Burroughs wrote: > Check out the "rows_cached" CF attribute. > > On 06/18/2012 06:01 PM, Oleg Dulin wrote: >> Dear distinguished colleagues: >> >> I don't want all of my CFs cached, but one in particular I do. >> >> How can I configure that ? >> >> Thanks, >> Oleg >> > -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
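For the archives, the 1.1 cassandra-cli equivalent looks something like this (a sketch; the column family name is a placeholder, and the valid values for the attribute are described in the blog post above):

```
update column family MyHotCF with caching = 'rows_only';
```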
Re: Rules for Major Compaction
On Tue, Jun 19, 2012 at 2:30 PM, Edward Capriolo wrote: > Your final two sentences are good ground rules. In our case we have > some column families that have high churn, for example a gc_grace > period of 4 days but the data is re-written completely every day. > Write activity over time will eventually cause tombstone removal but > we can expedite the process by forcing a major at night. Because the > tables are not really growing the **warning** below does not apply. Note that Cassandra 1.2 will automatically compact sstables that have more than a configurable amount of expired data (default 20%). So you won't have to force a major for this use case anymore. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Rules for Major Compaction
Thanks Ed. I am on 0.8.4, so I don't have the Leveled option, only SizeTiered. I have a strange problem. I have a 6 node cluster (DC1=3, DC2=3). One of the nodes has 105 GB of data whereas every other node has 60 GB, in spite of each one being a replica of the others. And I am contemplating whether I should be running compact/cleanup on the node with 105 GB. Btw, side question: does it make sense to run it for just 1 node, or is it advisable to run it for all? This node has also been giving me some issues lately. Last night during some heavy load, I got a lot of TimedOutExceptions from this node. The node was also flapping. I could see in the logs that it could see the peers dying and coming back up, ultimately throwing UnavailableException (and sometimes TimedOutException) on my requests. I use JNA mlockAll, so the JVM is definitely not swapping. I see a full GC running (according to GCInspector) for 15 seconds around the same time. But even after the GC, requests were timing out. Cassandra runs with Xmx8G, Xmn800M. Total RAM on the machine is 62 GB. I don't use any meaningful key cache or row cache and rely on the OS file cache. Top shows VIRT as 116G (which makes sense since I have 105 GB of data). Have you seen any issues with data this size on a node? -Raj On Tue, Jun 19, 2012 at 3:30 PM, Edward Capriolo wrote: > Hey my favorite question! It is a loaded question and it depends on > your workload. The answer has evolved over time. > > In the old days <0.6.5 the only way to remove tombstones was major > compaction. This is not true in any modern version. > > (Also in the old days you had to run cleanup to clear hints) > > Cassandra now has two compaction strategies, SizeTiered and Leveled. > Leveled can not be manually compacted. > > > Your final two sentences are good ground rules. In our case we have > some column families that have high churn, for example a gc_grace > period of 4 days but the data is re-written completely every day. 
> Write activity over time will eventually cause tombstone removal but > we can expedite the process by forcing a major at night. Because the > tables are not really growing the **warning** below does not apply. > > **Warning** this creates one large sstable, which is not always > desirable, because it fiddles with the heuristics of SizeTiered > (having one big table and other smaller ones). > > The updated answer is "You probably do not want to run major > compactions, but some use cases could see some benefits" > > On Tue, Jun 19, 2012 at 10:51 AM, Raj N wrote: > > DataStax recommends not to run major compactions. Edward Capriolo's > > Cassandra High Performance book suggests that major compaction is a good > > thing. And should be run on a regular basis. Are there any ground rules > > about running major compactions? For example, if you have write-once > kind of > > data that is never updated then it probably makes sense to not run major > > compaction. But if you have data which can be deleted or overwritten > does it > > make sense to run major compaction on a regular basis? > > > > Thanks > > -Raj >
Re: release of cassandra-unit 1.1.0.1
Hi Jeremy, Glad to see the update. It would be nice if the secondary index in cassandra-unit supported DoubleType. Yuhan On Wed, Jun 13, 2012 at 1:32 PM, Jérémy SEVELLEC wrote: > Hi all, > > cassandra-unit 1.1.0.1 is now released. cassandra-unit helps you write > isolated JUnit tests using cassandra (starting an embedded cassandra > instance, loading data from a dataset, ...) in a Test Driven Development style > or not :-). > > The artifact is published on the public maven repo. > > Main new features are: > - updating to hector 1.1-0 and cassandra-all 1.1.1 > - allow setting more options on a column family > > You can see all the detailed content of this release here: > https://github.com/jsevellec/cassandra-unit/wiki/changelog > > cassandra-unit documentation: > https://github.com/jsevellec/cassandra-unit/wiki > cassandra-unit examples: > https://github.com/jsevellec/cassandra-unit-examples > > This can perhaps help... > > Regards, > > -- > Jérémy > -- Yuhan Zhang Application Developer OneScreen Inc. yzh...@onescreen.com www.onescreen.com
Re: Rules for Major Compaction
Hey my favorite question! It is a loaded question and it depends on your workload. The answer has evolved over time. In the old days <0.6.5 the only way to remove tombstones was major compaction. This is not true in any modern version. (Also in the old days you had to run cleanup to clear hints) Cassandra now has two compaction strategies, SizeTiered and Leveled. Leveled can not be manually compacted. Your final two sentences are good ground rules. In our case we have some column families that have high churn, for example a gc_grace period of 4 days but the data is re-written completely every day. Write activity over time will eventually cause tombstone removal but we can expedite the process by forcing a major at night. Because the tables are not really growing the **warning** below does not apply. **Warning** this creates one large sstable, which is not always desirable, because it fiddles with the heuristics of SizeTiered (having one big table and other smaller ones). The updated answer is "You probably do not want to run major compactions, but some use cases could see some benefits" On Tue, Jun 19, 2012 at 10:51 AM, Raj N wrote: > DataStax recommends not to run major compactions. Edward Capriolo's > Cassandra High Performance book suggests that major compaction is a good > thing. And should be run on a regular basis. Are there any ground rules > about running major compactions? For example, if you have write-once kind of > data that is never updated then it probably makes sense to not run major > compaction. But if you have data which can be deleted or overwritten does it > make sense to run major compaction on a regular basis? > > Thanks > -Raj
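For a high-churn CF like the one described, the nightly major can be driven by cron (a sketch; the host, keyspace, and column family names are placeholders):

```
# crontab entry: major-compact one high-churn CF at 03:00 every night
0 3 * * * nodetool -h 127.0.0.1 compact MyKeyspace MyHighChurnCF
```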
cassandra secondary index with
Hi all, I'm trying to search by the secondary index of cassandra with "greater than or equal", but reached an exception stating: me.prettyprint.hector.api.exceptions.HInvalidRequestException: InvalidRequestException(why:No indexed columns present in index clause with operator EQ) However, the same column family with the same column works when the search expression is an "equal". I'm using the Hector java client. The secondary index type has been set to: {column_name: sport, validation_class: DoubleType, index_type: KEYS } Here's the code reaching the exception:

public QueryResult> getIndexedSlicesGTE(String columnFamily, String columnName, double value, String... columns) {
    Keyspace keyspace = getKeyspace();
    StringSerializer se = CassandraStorage.getStringExtractor();

    IndexedSlicesQuery indexedSlicesQuery = createIndexedSlicesQuery(keyspace, se, se, DoubleSerializer.get());
    indexedSlicesQuery.setColumnFamily(columnFamily);
    indexedSlicesQuery.setStartKey("");
    if (columns != null)
        indexedSlicesQuery.setColumnNames(columns);
    else {
        indexedSlicesQuery.setRange("", "", true, MAX_RECORD_NUMBER);
    }
    indexedSlicesQuery.setRowCount(CassandraStorage.MAX_RECORD_NUMBER);
    indexedSlicesQuery.addGteExpression(columnName, value); // this doesn't work :(
    //indexedSlicesQuery.addEqualsExpression(columnName, value); // this works!
    QueryResult> result = indexedSlicesQuery.execute();

    return result;
}

Is there any column_meta setting that is required in order to make GTE comparison work on a secondary index? Thank you. Yuhan Zhang
Unable to update CFs with duplicate index names
Hello We started using cassandra at version 0.7, which allowed duplicate names for indexes. We upgraded to version 0.8.10 a while ago and everything has been working fine. Now I am not able to run 'update column family' on a CF whose index names duplicate those of other CFs. If I update the CF with the same index names, I get "Duplicate index name userId". If I update with different index names or without index names, I get "Cannot modify index name". The only info I can find is https://issues.apache.org/jira/browse/CASSANDRA-2903, but it does not say anything about existing duplicate indexes. Thanks
Rules for Major Compaction
DataStax recommends not running major compactions. Edward Capriolo's Cassandra High Performance book suggests that major compaction is a good thing and should be run on a regular basis. Are there any ground rules about running major compactions? For example, if you have write-once kind of data that is never updated, then it probably makes sense not to run major compaction. But if you have data which can be deleted or overwritten, does it make sense to run major compaction on a regular basis? Thanks -Raj
Re: Snapshot failing on JSON files in 1.1.0
Hi again, apt-get install libjna-java installed nothing, I was already up to date. I made the symbolic link jna.jar to target jna-3.4.1.jar (downloaded @ the given link) instead of jna-3.2.4.jar. I could restart with the 'JNA mlockall successful' message. I am still unable to snapshot my data. I got the following output : Exception in thread "main" java.io.IOError: java.io.IOException: Unable to create hard link from /raid0/cassandra/data/cassa_teads/stats_product-hc-233-Data.db to /raid0/cassandra/data/cassa_teads/snapshots/1340099026781/stats_product-hc-233-Data.db (errno 1) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1433) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462) at org.apache.cassandra.db.Table.snapshot(Table.java:210) at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1710) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:93) at com.sun.jmx.mbeanserver.StandardMBeanIntrospector.invokeM2(StandardMBeanIntrospector.java:27) at com.sun.jmx.mbeanserver.MBeanIntrospector.invokeM(MBeanIntrospector.java:208) at com.sun.jmx.mbeanserver.PerInterface.invoke(PerInterface.java:120) at com.sun.jmx.mbeanserver.MBeanSupport.invoke(MBeanSupport.java:262) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.invoke(DefaultMBeanServerInterceptor.java:836) at com.sun.jmx.mbeanserver.JmxMBeanServer.invoke(JmxMBeanServer.java:761) at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1427) at javax.management.remote.rmi.RMIConnectionImpl.access$200(RMIConnectionImpl.java:72) at 
javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1265) at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1360) at javax.management.remote.rmi.RMIConnectionImpl.invoke(RMIConnectionImpl.java:788) at sun.reflect.GeneratedMethodAccessor42.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:303) at sun.rmi.transport.Transport$1.run(Transport.java:159) at java.security.AccessController.doPrivileged(Native Method) at sun.rmi.transport.Transport.serviceCall(Transport.java:155) at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:535) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:790) at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:649) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Caused by: java.io.IOException: Unable to create hard link from /raid0/cassandra/data/cassa_teads/stats_product-hc-233-Data.db to /raid0/cassandra/data/cassa_teads/snapshots/1340099026781/stats_product-hc-233-Data.db (errno 1) at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:158) at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:857) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1412) ... 
32 more Logs tell me this : ERROR 09:43:46,840 Unable to create hard link com.sun.jna.LastErrorException: [1]ÃX at org.apache.cassandra.utils.CLibrary.link(Native Method) at org.apache.cassandra.utils.CLibrary.createHardLink(CLibrary.java:145) at org.apache.cassandra.io.sstable.SSTableReader.createLinks(SSTableReader.java:857) at org.apache.cassandra.db.ColumnFamilyStore.snapshotWithoutFlush(ColumnFamilyStore.java:1412) at org.apache.cassandra.db.ColumnFamilyStore.snapshot(ColumnFamilyStore.java:1462) at org.apache.cassandra.db.Table.snapshot(Table.java:210) at org.apache.cassandra.service.StorageService.takeSnapshot(StorageService.java:1710) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597)
Re: Problem with streaming with sstableloader into ubuntu node
The code is processing the file name, without the path, and appears to be correct. Can you show the full error (including any other output) and the directory / files you are running the bulk load against when in Windows? Bulk load expects keyspace/ column_family/ sstable-file Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/06/2012, at 3:13 AM, Nury Redjepow wrote: > Okay, we investigated and found the source of the problem in package > org.apache.cassandra.io.sstable; > > public class Descriptor > > public static Pair fromFilename(File directory, String > name) > { > // tokenize the filename > StringTokenizer st = new StringTokenizer(name, String.valueOf(separator)); > String nexttok; > > If the bulk loader is running from Windows and cassandra is running under Ubuntu, > directory is > ("KeySpaceName\\ColumnFamilyName\\KeySpaceName-ColumnFamilyName-hc-177-Data.db" > > > so at the next rows > String ksname = st.nextToken(); > String cfname = st.nextToken(); > > ksname becomes "KeySpaceName\\ColumnFamilyName\\KeySpaceName" > > > Sincerely, Nury. > > > > > Mon, 18 Jun 2012 15:40:17 +1200 от aaron morton : > Cross platform clusters are not really supported. > > That said, it sounds like a bug. If you can create some steps to reproduce it > please create a ticket here https://issues.apache.org/jira/browse/CASSANDRA > and it may get looked at. > > Cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 16/06/2012, at 12:41 AM, Nury Redjepow wrote: >> Good day, everyone >> >> We are using sstableloader to bulk insert data into cassandra. >> >> The script is executed on a developer's machine with Windows to a Single Node >> Cassandra. 
>> >> "%JAVA_HOME%\bin\java" -ea -cp %CASSANDRA_CLASSPATH% -Xmx256M >> -Dlog4j.configuration=log4j-tools.properties >> org.apache.cassandra.tools.BulkLoader -d 10.0.3.37 --debug -v >> "DestinationPrices/PricesByHotel" >> >> This works fine if destination cassandra is working under windows, but >> doesn't work with ubuntu instance. Cli is able to connect, but sstable seem >> to have problem with keyspace name. Logs in ubuntu instance show error >> messages like: >> >> ERROR [Thread-41] 2012-06-15 16:05:47,620 AbstractCassandraDaemon.java (line >> 134) Exception in thread Thread[Thread-41,5,main] >> java.lang.AssertionError: Unknown keyspace >> DestinationPrices\PricesByHotel\DestinationPrices >> >> >> In our schema we have keyspace DestinationPrices, and column family >> PricesByHotel. Somehow it's not accepted properly. >> >> So my question is, how should I specify keyspace name in command, to make it >> work correctly with Ubuntu? >> > >
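The analysis above can be reproduced outside Cassandra: tokenizing on the platform separator ('/' on Ubuntu) never splits a backslash-separated name coming from a Windows client, so the first token, taken as the keyspace, is the entire path. A shell sketch of the same splitting behaviour, using the file name from the log above:

```shell
# A Windows-style name contains no '/', so a '/'-based split yields one token.
name='DestinationPrices\PricesByHotel\DestinationPrices-PricesByHotel-hc-177-Data.db'
ksname=${name%%/*}          # first token when splitting on '/'
printf 'parsed keyspace: %s\n' "$ksname"

# The same split on a Unix-style name works as intended.
unixname='DestinationPrices/PricesByHotel/DestinationPrices-PricesByHotel-hc-177-Data.db'
printf 'parsed keyspace: %s\n' "${unixname%%/*}"
```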
Re: Change of behaviour in multiget_slice query for unknown keys between 0.7 and 1.1?
Nothing has changed in the server, try the Hector user group. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 19/06/2012, at 12:02 PM, Edward Sargisson wrote: > Hi all, > Was there a change of behaviour in multiget_slice query in Cassandra or > Hector between 0.7 and 1.1 when dealing with a key that doesn't exist? > > We've just upgraded and our in memory unit test is failing (although just on > my machine). The test code is looking for a key that doesn't exist and > expects to get null. Instead it gets a ColumnSlice with a single column > called val. If there were something there then we'd expect columns with names > like bytes, int or string. Other rows in the column family have those columns > as well as val. > > Is there a reason for this behaviour? > I'd like to see if there was an explanation before I change the unit test for > it. > > Many thanks in advance, > Edward > > -- > Edward Sargisson > senior java developer > Global Relay > > edward.sargis...@globalrelay.net > > > 866.484.6630 > New York | Chicago | Vancouver | London (+44.0800.032.9829) | Singapore > (+65.3158.1301) > > Global Relay Archive supports email, instant messaging, BlackBerry, > Bloomberg, Thomson Reuters, Pivot, YellowJacket, LinkedIn, Twitter, Facebook > and more. > > Ask about Global Relay Message — The Future of Collaboration in the Financial > Services World > > All email sent to or from this address will be retained by Global Relay’s > email archiving system. This message is intended only for the use of the > individual or entity to which it is addressed, and may contain information > that is privileged, confidential, and exempt from disclosure under applicable > law. Global Relay will not be liable for any compliance or technical > information provided herein. All trademarks are the property of their > respective owners.