Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Not sure if anyone has seen this before but it's really killing me right
now.  Perhaps that was too long of a description of the issue so here's a
more succinct question -- How do I remove nodes associated with a cluster
that contain no data and have no reason to be associated with the cluster
whatsoever?

My last resort here is to stop cassandra (after recording all tokens for
each node), set the initial token for each node in the cluster in
cassandra.yaml, manually delete the LocationInfo* sstables in the system
keyspace, and then restart.  I'm hoping there's a simpler, less seemingly
risky way to do this so please, please let me know if that's true!

Thanks again.

- Eric

On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech e...@nextbigsound.com wrote:

 Hi, I'm having what I think is a fairly uncommon schema issue --

 My situation is that I had a cluster with 10 nodes and a consistent schema.
  Then, in an experiment to set up a second cluster with the same information
 (by copying the raw sstables), I left the LocationInfo* sstables in the
 system keyspace in the new cluster and, after starting the second cluster, I
 realized that the two clusters were discovering each other when they
 shouldn't have been.  Since then, I changed the cluster name for the second
 cluster and made sure to delete the LocationInfo* sstables before starting
 it, and the two clusters are now operating independently of one another for the
 most part.  The only remaining connection between the two seems to be that
 the first cluster is still maintaining references to nodes in the second
 cluster in the schema versions despite those nodes not actually being part
 of the ring.

 Here's what my describe cluster looks like on the original cluster:

 Cluster Information:
Snitch: org.apache.cassandra.locator.SimpleSnitch
Partitioner: org.apache.cassandra.dht.RandomPartitioner
Schema versions:
 48971cb0-e9ff-11e0--eb9eab7d90bf: [INTENTIONAL_IP1,
 INTENTIONAL_IP2, ..., INTENTIONAL_IP10]
  848bcfc0-eddf-11e0--8a3bb58f08ff: [NOT_INTENTIONAL_IP1,
 NOT_INTENTIONAL_IP2]

 The second cluster, however, contains no schema versions involving nodes
 from the first cluster.

 My question then is, how can I remove those schema versions from the
 original cluster that are associated with the unwanted nodes from the second
 cluster?  Is there any way to remove or evict an IP from a cluster instead
 of just a token?

 Thanks in advance!

 - Eric
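
The schema-version listing that describe cluster prints comes from the Thrift describe_schema_versions call, so the same check can be scripted against any live node. A minimal sketch of that check; the host, port, and class name are placeholders and not part of the original mails:

import java.util.List;
import java.util.Map;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class SchemaVersionCheck {
    public static void main(String[] args) throws Exception {
        // Connect to any live node over Thrift (framed transport, default RPC port 9160).
        TFramedTransport transport = new TFramedTransport(new TSocket("cass-1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        // Each entry maps a schema version UUID to the endpoints reporting it;
        // more than one entry means the cluster disagrees on the schema.
        Map<String, List<String>> versions = client.describe_schema_versions();
        for (Map.Entry<String, List<String>> entry : versions.entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
        transport.close();
    }
}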



Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
I don't think that's what I'm after here since the unwanted nodes were
originally assimilated into the cluster with the same initial_token values
as other nodes that were already in the cluster (that have, and still do
have, useful data).  I know this is an awkward situation so I'll try to
depict it in a simpler way:

Let's say I have a simplified version of our production cluster that looks
like this -

cass-1   token = A
cass-2   token = B
cass-3   token = C

Then I tried to create a second cluster that looks like this -

cass-analysis-1   token = A  (and contains same data as cass-1)
cass-analysis-2   token = B  (and contains same data as cass-2)
cass-analysis-3   token = C  (and contains same data as cass-3)

But after starting the second cluster, things got crossed up between the
clusters and here's what the original cluster now looks like -

cass-1   token = A   (has data and schema)
cass-2   token = B   (has data and schema)
cass-3   token = C   (has data and schema)
cass-analysis-1   token = A  (has *no* data and is not part of the ring, but
is trying to be included in cluster schema)

A simplified version of describe cluster  for the original cluster now
shows:

Cluster Information:
   Schema versions:
 SCHEMA-UUID-1: [cass-1, cass-2, cass-3]
SCHEMA-UUID-2: [cass-analysis-1]

But the simplified ring looks like this (has only 3 nodes instead of 4):

Host   Owns Token
cass-1 33%   A
cass-2 33%   B
cass-3 33%   C

The original cluster is still working correctly but all live schema updates
are failing because of the inconsistent schema versions introduced by the
unwanted node.

From my perspective, a simple fix seems to be for cassandra to exclude nodes
that aren't part of the ring from the schema consistency requirements.  Any
reason that wouldn't work?

And aside from a possible code patch, any recommendations as to how I can
best fix this given the current 0.8.4 release?


On Thu, Oct 13, 2011 at 12:14 AM, Jonathan Ellis jbel...@gmail.com wrote:

 Does nodetool removetoken not work?

 On Thu, Oct 13, 2011 at 12:59 AM, Eric Czech e...@nextbigsound.com
 wrote:
  Not sure if anyone has seen this before but it's really killing me right
  now.  Perhaps that was too long of a description of the issue so here's a
  more succinct question -- How do I remove nodes associated with a cluster
  that contain no data and have no reason to be associated with the cluster
  whatsoever?
  My last resort here is to stop cassandra (after recording all tokens for
  each node), set the initial token for each node in the cluster in
  cassandra.yaml, manually delete the LocationInfo* sstables in the system
  keyspace, and then restart.  I'm hoping there's a simpler, less seemingly
  risky way to do this so please, please let me know if that's true!
  Thanks again.
  - Eric
  On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech e...@nextbigsound.com
 wrote:
 
  Hi, I'm having what I think is a fairly uncommon schema issue --
  My situation is that I had a cluster with 10 nodes and a consistent
  schema.  Then, in an experiment to setup a second cluster with the same
  information (by copying the raw sstables), I left the LocationInfo*
 sstables
  in the system keyspace in the new cluster and after starting the second
  cluster, I realized that the two clusters were discovering each other
 when
  they shouldn't have been.  Since then, I changed the cluster name for
 the
  second cluster and made sure to delete the LocationInfo* sstables before
  starting it and the two clusters are now operating independent of one
  another for the most part.  The only remaining connection between the
 two
  seems to be that the first cluster is still maintaining references to
 nodes
  in the second cluster in the schema versions despite those nodes not
  actually being part of the ring.
  Here's what my describe cluster looks like on the original cluster:
  Cluster Information:
 Snitch: org.apache.cassandra.locator.SimpleSnitch
 Partitioner: org.apache.cassandra.dht.RandomPartitioner
 Schema versions:
  48971cb0-e9ff-11e0--eb9eab7d90bf: [INTENTIONAL_IP1,
  INTENTIONAL_IP2, ..., INTENTIONAL_IP10]
  848bcfc0-eddf-11e0--8a3bb58f08ff: [NOT_INTENTIONAL_IP1,
  NOT_INTENTIONAL_IP2]
  The second cluster, however, contains no schema versions involving nodes
  from the first cluster.
  My question then is, how can I remove those schema versions from the
  original cluster that are associated with the unwanted nodes from the
 second
  cluster?  Is there any way to remove or evict an IP from a cluster
 instead
  of just a token?
  Thanks in advance!
  - Eric
 



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau

Hi Stephen,
this is a great idea but unfortunately doesn't work for us either, as we
cannot store the data in an unencrypted form.


Kind regards
Matthias

On 10/12/2011 07:42 PM, Stephen Connolly wrote:

could you prefix the data with 3-4 bytes of a linear hash of the
unencrypted data? it wouldn't be a perfect sort, but you'd have less of a
range to query to get the sorted values?

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on
the screen
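
A minimal sketch of this prefixing idea, assuming each item has a byte-comparable plaintext sort key and that some encrypt step already exists (both assumptions, not stated in the thread). The first few plaintext bytes become a coarse, order-preserving prefix on the column name; ties within one prefix still need client-side sorting after decryption:

import java.nio.ByteBuffer;
import java.util.Arrays;

public class PrefixedColumnName {
    // Column name = 4-byte order-preserving prefix from the plaintext sort key,
    // followed by the ciphertext so equal prefixes still get distinct names.
    static ByteBuffer columnName(byte[] plaintextSortKey, byte[] ciphertext) {
        byte[] prefix = Arrays.copyOf(plaintextSortKey, 4);
        ByteBuffer name = ByteBuffer.allocate(prefix.length + ciphertext.length);
        name.put(prefix).put(ciphertext);
        name.flip();
        return name;
    }
}

The column family would use a byte-ordered comparator (BytesType) so the prefix drives the on-disk sort.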

On 12 Oct 2011 17:57, Matthias Pfau p...@l3s.de wrote:

Unfortunately, that is not an option as we have to store the data in
a compressed and encrypted, and therefore binary and non-sortable, form.

On 10/12/2011 06:39 PM, David McNelis wrote:

Is it an option to not convert the data to binary prior to inserting
into Cassandra?  Also, how large are the strings you're sorting?
  If its
viable to not convert to binary before writing to Cassandra, and
you use
one of the string based column ordering techniques (utf8, ascii, for
example), then the data would be sorted without you  needing to
specifically worry about that.  Of course, if the strings are
lengthy
you could run into  additional issues.

On Wed, Oct 12, 2011 at 11:34 AM, Matthias Pfau p...@l3s.de wrote:

Hi there,
we are currently building a prototype based on cassandra and
came
into problems on implementing sorted lists containing
millions of items.

The special thing about the items of our lists is, that
cassandra is
not able to sort them as the data is stored in a binary
format which
is not sortable. However, we are able to sort the data
before the
plain data gets encoded (our application is responsible for
the order).

First Approach: Storing Lists in ColumnFamilies
***
We first tried to map the list to a single row of a
ColumnFamily in
a way that the index of the list is mapped to the column
names and
the items of the list to the column values. The column names are
increasing numbers which define the sort order.
This has the major drawback that big parts of the list have
to be
rewritten on inserts (because the column names are numbered
by their
index), which are quite common.


Second Approach: Storing the whole List as Binary Data:
***
We tried to store the compressed list in a single column.
However,
this is only feasible for smaller lists. Our lists are far
too big,
leading to multi-megabyte reads and writes. As we need to
read and
update the lists quite often, this would put our Cassandra
cluster
under a lot of pressure.

Ideal Solution: Native support for storing lists
***
We would be very happy with a way to store a list of sorted
values
without making improper use of column names for the list
index. This
implies that we would need a possibility to insert values at
defined
positions. We know that this could lead to problems with
concurrent
inserts in a distributed environment, but this is handled by our
application logic.


What are your ideas on that?

Thanks
Matthias




--
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
c: 219.384.5143

/A Smart Grid technology company focused on helping consumers of
energy
control an often under-managed resource./







Re: Existing column(s) not readable

2011-10-13 Thread Thomas Richter
Hi Aaron,

I guess i found it :-).

I added logging for the used IndexInfo to
SSTableNamesIterator.readIndexedColumns and got negative index postions
for the missing columns. This is the reason why the columns are not
loaded from sstable.

So I had a look at ColumnIndexer.serializeInternal and there it is:

int endPosition = 0, startPosition = -1;

Should be:

long endPosition = 0, startPosition = -1;

I'm currently running a compaction with a fixed version to verify.
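
A standalone sketch of the failure mode, added for illustration and not from the original message: once the serialized row grows past Integer.MAX_VALUE bytes (about 2 GB), accumulating column offsets in an int wraps negative, matching the negative IndexInfo positions mentioned above.

public class IndexOffsetOverflow {
    public static void main(String[] args) {
        int intPosition = 0;
        long longPosition = 0;
        long columnSize = 64 * 1024;   // pretend every serialized column is 64 KB
        long columns = 40000;          // ~2.5 GB of columns in a single row
        for (long i = 0; i < columns; i++) {
            intPosition += columnSize;  // silently wraps once the offset passes 2^31 - 1
            longPosition += columnSize;
        }
        System.out.println("int position:  " + intPosition);   // negative after overflow
        System.out.println("long position: " + longPosition);  // correct
    }
}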

Best,

Thomas

On 10/12/2011 11:54 PM, aaron morton wrote:
 Sounds a lot like the column is deleted. 
 
 IIRC this is where the columns from various SSTables are reduced
 https://github.com/apache/cassandra/blob/cassandra-0.8/src/java/org/apache/cassandra/db/filter/QueryFilter.java#L117
 
 The call to ColumnFamily.addColumn() is where the column instance may be 
 merged with other instances. 
 
 A 
 
 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com
 
 
 On 13/10/2011, at 5:33 AM, Thomas Richter wrote:
 
 Hi Aaron,

 I cannot read the column with a slice query.
 The slice query only returns data till a certain column and after that I
 only get empty results.

 I added log output to QueryFilter.isRelevant to see if the filter is
 dropping the column(s) but it doesn't even show up there.

 Next thing I will check is the diff between columns contained in
 the json export and columns fetched with the slice query, maybe this gives
 more clues...

 Any other ideas where to place more debugging output to see what's
 happening?

 Best,

 Thomas

 On 10/11/2011 12:46 PM, aaron morton wrote:
 kewl, 

 * Row is not deleted (other columns can be read, row survives compaction
 with GCGraceSeconds=0)

 IIRC row tombstones can hang around for a while (until gc grace has 
 passed), and they only have an effect on columns that have a lower 
 timestamp. So it's possible to read columns from a row with a tombstone. 

 Can you read the column using a slice range rather than specifying it's 
 name ? 

 Aaron

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 11/10/2011, at 11:15 PM, Thomas Richter wrote:

 Hi Aaron,

 i invalidated the caches but nothing changed. I didn't get the mentioned
 log line either, but as I read the code SliceByNamesReadCommand uses
 NamesQueryFilter and not SliceQueryFilter.

 Next, there is only one SSTable.

 I can rule out that the row is deleted because I deleted all other rows
 in that CF to reduce data size and speed up testing. I set
 GCGraceSeconds to zero and ran a compaction. All other rows are gone,
 but i can still access at least one column from the left row.
 So as far as I understand it, there should not be a tombstone on row level.

 To make it a list:

 * One SSTable, one row
 *
 * Row is not deleted (other columns can be read, row survives compaction
 with GCGraceSeconds=0)
 * Most columns can be read by get['row']['col'] from cassandra-cli
 * Some columns can not be read by get['row']['col'] from cassandra-cli
 but can be found in output of sstable2json
 * unreadable data survives compaction with GCGraceSeconds=0 (checked
 with sstable2json)
 * Invalidating caches does not help
 * Nothing in the logs

 Does that point into any direction where i should look next?

 Best,

 Thomas

 On 10/11/2011 10:30 AM, aaron morton wrote:
 Nothing jumps out. The obvious answer is that the column has been 
 deleted. Did you check all the SSTables ?

 It looks like query returned from row cache, otherwise you would see this 
 as well…

 DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java (line 
 123) collecting 0 of 2147483647: 
 1318294191654059:false:354@1318294191654861

 Which would mean a version of the column was found. 

 If you invalidate the cache with nodetool and run the query and the log 
 message appears it will mean the column was read from (all of the) 
 sstables. If you do not get a column returned I would say there is a 
 tombstone in place. It's either a row level or a column level one.  

 Hope that helps. 

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 11/10/2011, at 10:35 AM, Thomas Richter wrote:

 Hi Aaron,

 normally we use hector to access cassandra, but for debugging I switched
 to cassandra-cli.

 Column can not be read by a simple
 get CFName['rowkey']['colname'];

 Response is Value was not found
 if i query another column, everything is just fine.

 Serverlog for unsuccessful read (keyspace and CF names replaced):

 DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,739 CassandraServer.java
 (line 280) get

 DEBUG [pool-1-thread-1] 2011-10-10 23:15:29,744 StorageProxy.java (line
 320) Command/ConsistencyLevel is
 SliceByNamesReadCommand(table='Keyspace',
 

Re: Cassandra as session store under heavy load

2011-10-13 Thread Maciej Miklas
durable_writes sounds great - thank you! I really do not need the commit log
here.

Another question: is it possible to configure the lifetime of tombstones?


Regards,
Maciej
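
For context, tombstone lifetime is governed by the column family's gc_grace_seconds setting: tombstones only become eligible for removal at compaction after that many seconds. A minimal sketch of lowering it over the Thrift API, assuming an already-connected Cassandra.Client named client; the keyspace and column family names are placeholders:

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;

public class TombstoneGrace {
    // Lower gc_grace_seconds for one column family so tombstones can be purged sooner.
    static void shortenTombstoneLifetime(Cassandra.Client client) throws Exception {
        client.set_keyspace("Sessions");                        // placeholder keyspace
        KsDef ksDef = client.describe_keyspace("Sessions");
        for (CfDef cfDef : ksDef.getCf_defs()) {
            if ("SessionData".equals(cfDef.getName())) {        // placeholder column family
                cfDef.setGc_grace_seconds(3600);                // one hour instead of the 10-day default
                client.system_update_column_family(cfDef);
            }
        }
    }
}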


Re: Storing pre-sorted data

2011-10-13 Thread Zach Richardson
Matthias,

This is an interesting problem.

I would consider using longs as the column type, where your column
names are evenly distributed longs in sort order when you first write
your list out.  So if you have items A and C with the long column
names 1000 and 2000, and then you have to insert B, it gets inserted
at 1500.  Once you run out of room between any two column name
entries, i.e. 1000, 1001, 1002 are all taken at some spot in the
list, go ahead and re-write the list.

If your unencrypted data is uniformly distributed, you will have very
few collisions on your column names and should not have to re-write
the list too often.

If your lists are small enough, then you could use ints to save space,
but will then have to re-write the list more often.

Thanks,

Zach
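
A minimal sketch of the scheme described above, purely illustrative (the class and constants are invented): column names are longs written with gaps, an insert between two neighbours takes the midpoint, and the caller rewrites the list once a gap is exhausted.

import java.util.TreeMap;

public class GappedListIndex {
    private static final long INITIAL_GAP = 1L << 20;  // spacing used when the list is first written
    private final TreeMap<Long, byte[]> columns = new TreeMap<Long, byte[]>();  // column name -> value

    // Append with a fixed gap when writing the list out the first time.
    void append(byte[] value) {
        long name = columns.isEmpty() ? 0L : columns.lastKey() + INITIAL_GAP;
        columns.put(name, value);
    }

    // Insert between two existing neighbours; returns false when no room is left,
    // meaning the whole list has to be rewritten with fresh spacing.
    boolean insertBetween(long before, long after, byte[] value) {
        long midpoint = before + (after - before) / 2;
        if (midpoint == before || columns.containsKey(midpoint)) {
            return false;
        }
        columns.put(midpoint, value);
        return true;
    }
}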

On Thu, Oct 13, 2011 at 2:47 AM, Matthias Pfau p...@l3s.de wrote:
 Hi Stephen,
 this is a great idea but unfortunately doesn't work for us either as we can
 not store the data in an unencrypted form.

 Kind regards
 Matthias

 On 10/12/2011 07:42 PM, Stephen Connolly wrote:

 could you prefix the data with 3-4 bytes of a linear hash of the
 unencypted data? it wouldn't be a perfect sort, but you'd have less of a
 range to query to get the sorted values?

 - Stephen

 ---
 Sent from my Android phone, so random spelling mistakes, random nonsense
 words and other nonsense are a direct result of using swype to type on
 the screen

 On 12 Oct 2011 17:57, Matthias Pfau p...@l3s.de mailto:p...@l3s.de
 wrote:

    Unfortunately, that is not an option as we have to store the data in
    an compressed and encrypted and therefore binary and non-sortable form.

    On 10/12/2011 06:39 PM, David McNelis wrote:

        Is it an option to not convert the data to binary prior to
 inserting
        into Cassandra?  Also, how large are the strings you're sorting?
          If its
        viable to not convert to binary before writing to Cassandra, and
        you use
        one of the string based column ordering techniques (utf8, ascii,
 for
        example), then the data would be sorted without you  needing to
        specifically worry about that.  Of course, if the strings are
        lengthy
        you could run into  additional issues.

        On Wed, Oct 12, 2011 at 11:34 AM, Matthias Pfau p...@l3s.de
        mailto:p...@l3s.de
        mailto:p...@l3s.de mailto:p...@l3s.de wrote:

            Hi there,
            we are currently building a prototype based on cassandra and
        came
            into problems on implementing sorted lists containing
        millions of items.

            The special thing about the items of our lists is, that
        cassandra is
            not able to sort them as the data is stored in a binary
        format which
            is not sortable. However, we are able to sort the data
        before the
            plain data gets encoded (our application is responsible for
        the order).

            First Approach: Storing Lists in ColumnFamilies
            ***
            We first tried to map the list to a single row of a
        ColumnFamily in
            a way that the index of the list is mapped to the column
        names and
            the items of the list to the column values. The column names
 are
            increasing numbers which define the sort order.
            This has the major drawback that big parts of the list have
        to be
            rewritten on inserts (because the column names are numbered
        by their
            index), which are quite common.


            Second Approach: Storing the whole List as Binary Data:
            ***
            We tried to store the compressed list in a single column.
        However,
            this is only feasible for smaller lists. Our lists are far
        to big
            leading to multi megabyte reads and writes. As we need to
        read and
            update the lists quite often, this would put our Cassandra
        cluster
            under a lot of pressure.

            Ideal Solution: Native support for storing lists
            ***
            We would be very happy with a way to store a list of sorted
        values
            without making improper use of column names for the list
        index. This
            implies that we would need a possibility to insert values at
        defined
            positions. We know that this could lead to problems with
        concurrent
            inserts in a distributed environment, but this is handled by
 our
            application logic.


            What are your ideas on that?

            Thanks
            Matthias




        --
        *David McNelis*
        Lead Software Engineer
        Agentis Energy
        www.agentisenergy.com
        c: 219.384.5143

        /A Smart Grid technology company focused on helping consumers of
        energy
        control an often 

Re: [Solved] column index offset miscalculation (was: Existing column(s) not readable)

2011-10-13 Thread Sylvain Lebresne
JIRA is not read-only; you should be able to create a ticket at
https://issues.apache.org/jira/browse/CASSANDRA, though
that probably requires that you create an account.

--
Sylvain

On Thu, Oct 13, 2011 at 3:20 PM, Thomas Richter t...@tricnet.de wrote:
 Hi Aaron,

 the fix does the trick. I wonder why nobody else ran into this before...
 I checked org/apache/cassandra/db/ColumnIndexer.java in 0.7.9, 0.8.7 and
 1.0.0-rc2 and all seem to be affected.

 Looks like public Jira is readonly - so I'm not sure about how to continue.

 Best,

 Thomas

 On 10/13/2011 10:52 AM, Thomas Richter wrote:
 Hi Aaron,

 I guess i found it :-).

 I added logging for the used IndexInfo to
 SSTableNamesIterator.readIndexedColumns and got negative index positions
 for the missing columns. This is the reason why the columns are not
 loaded from sstable.

 So I had a look at ColumnIndexer.serializeInternal and there it is:

 int endPosition = 0, startPosition = -1;

 Should be:

 long endPosition = 0, startPosition = -1;

 I'm currently running a compaction with a fixed version to verify.

 Best,

 Thomas

 On 10/12/2011 11:54 PM, aaron morton wrote:
 Sounds a lot like the column is deleted.

 IIRC this is where the columns from various SSTables are reduced
 https://github.com/apache/cassandra/blob/cassandra-0.8/src/java/org/apache/cassandra/db/filter/QueryFilter.java#L117

 The call to ColumnFamily.addColumn() is where the column instance may be 
 merged with other instances.

 A

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com


 On 13/10/2011, at 5:33 AM, Thomas Richter wrote:

 Hi Aaron,

 I cannot read the column with a slice query.
 The slice query only returns data till a certain column and after that i
 only get empty results.

 I added log output to QueryFilter.isRelevant to see if the filter is
 dropping the column(s) but it doesn't even show up there.

 Next thing i will check check is the diff between columns contained in
 json export and columns fetched with the slice query, maybe this gives
 more clue...

 Any other ideas where to place more debugging output to see what's
 happening?

 Best,

 Thomas

 On 10/11/2011 12:46 PM, aaron morton wrote:
 kewl,

 * Row is not deleted (other columns can be read, row survives compaction
 with GCGraceSeconds=0)

 IIRC row tombstones can hang around for a while (until gc grace has 
 passed), and they only have an effect on columns that have a lower 
 timestamp. So it's possible to read columns from a row with a tombstone.

 Can you read the column using a slice range rather than specifying it's 
 name ?

 Aaron

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 11/10/2011, at 11:15 PM, Thomas Richter wrote:

 Hi Aaron,

 i invalidated the caches but nothing changed. I didn't get the mentioned
 log line either, but as I read the code SliceByNamesReadCommand uses
 NamesQueryFilter and not SliceQueryFilter.

 Next, there is only one SSTable.

 I can rule out that the row is deleted because I deleted all other rows
 in that CF to reduce data size and speed up testing. I set
 GCGraceSeconds to zero and ran a compaction. All other rows are gone,
 but i can still access at least one column from the left row.
 So as far as I understand it, there should not be a tombstone on row 
 level.

 To make it a list:

 * One SSTable, one row
 *
 * Row is not deleted (other columns can be read, row survives compaction
 with GCGraceSeconds=0)
 * Most columns can be read by get['row']['col'] from cassandra-cli
 * Some columns can not be read by get['row']['col'] from cassandra-cli
 but can be found in output of sstable2json
 * unreadable data survives compaction with GCGraceSeconds=0 (checked
 with sstable2json)
 * Invalidation caches does not help
 * Nothing in the logs

 Does that point into any direction where i should look next?

 Best,

 Thomas

 On 10/11/2011 10:30 AM, aaron morton wrote:
 Nothing jumps out. The obvious answer is that the column has been 
 deleted. Did you check all the SSTables ?

 It looks like query returned from row cache, otherwise you would see 
 this as well…

 DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java 
 (line 123) collecting 0 of 2147483647: 
 1318294191654059:false:354@1318294191654861

 Which would mean a version of the column was found.

 If you invalidate the cache with nodetool and run the query and the log 
 message appears it will mean the column was read from (all of the) 
 sstables. If you do not get a column returned I would say there is a 
 tombstone in place. It's either a row level or a column level one.

 Hope that helps.

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 11/10/2011, at 10:35 AM, Thomas Richter wrote:

 Hi Aaron,

 normally we 

Re: supercolumns vs. prefixing columns of same data type?

2011-10-13 Thread hani elabed
Hi Dean,
I don't have an answer to your question, but just in case you haven't
seen this screencast by Ed Anuff on Cassandra Indexes, it helped me a lot.
http://blip.tv/datastax/indexing-in-cassandra-5495633

Hani

On Wed, Oct 12, 2011 at 12:18 PM, Dean Hiller d...@alvazan.com wrote:

 I heard cassandra may be going in the direction of removing super columns, and
 users are starting to just use prefixes in front of the column name.

 The reason I ask is that I was going the way of only using supercolumns, and then
 many tables were fixed with just one supercolumn per row as the structure
 for that table was simple... this kept the api we have on top of Hector
 extremely simple, not having to deal with columns vs. supercolumns.  What are
 people's thoughts on this?

 Dealing in columnfamilies where some have supercolumns and some don't is,
 I think personally, a painful way to go... going with just one way and
 sticking with it sure makes the apis easier, and it's much easier to apply
 AOP-type stuff to that ONE insert method rather than having two insert
 methods.  So what is the direction of the cassandra project and the
 recommendation?

 thanks,
 Dean
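
A minimal sketch of the prefixing alternative mentioned above, illustrative only and not from the thread: the would-be super column name and the sub-column name are packed into a single string column name with a separator, so a standard column family can stand in for a super column family.

public class PrefixedColumns {
    private static final char SEP = ':';

    // e.g. name("address", "zipcode") -> "address:zipcode"
    static String name(String superKey, String columnKey) {
        return superKey + SEP + columnKey;
    }

    static String superKeyOf(String columnName) {
        return columnName.substring(0, columnName.indexOf(SEP));
    }

    static String columnKeyOf(String columnName) {
        return columnName.substring(columnName.indexOf(SEP) + 1);
    }
}

Reading everything under one former super column then becomes a column slice from "address:" to "address;" (';' being the next character after ':'), assuming a UTF8 or ASCII comparator.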



Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Mohit Anchlia
Do you have the same seed node specified in cass-analysis-1 as in cass-1,2,3?
I am thinking that changing the seed node in cass-analysis-2 and
following the directions in
http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve
the problem. Someone please correct me.

On Thu, Oct 13, 2011 at 12:05 AM, Eric Czech e...@nextbigsound.com wrote:
 I don't think that's what I'm after here since the unwanted nodes were
 originally assimilated into the cluster with the same initial_token values
 as other nodes that were already in the cluster (that have, and still do
 have, useful data).  I know this is an awkward situation so I'll try to
 depict it in a simpler way:
 Let's say I have a simplified version of our production cluster that looks
 like this -
 cass-1   token = A
 cass-2   token = B
 cass-3   token = C
 Then I tried to create a second cluster that looks like this -
 cass-analysis-1   token = A  (and contains same data as cass-1)
 cass-analysis-2   token = B  (and contains same data as cass-2)
 cass-analysis-3   token = C  (and contains same data as cass-3)
 But after starting the second cluster, things got crossed up between the
 clusters and here's what the original cluster now looks like -
 cass-1   token = A   (has data and schema)
 cass-2   token = B   (has data and schema)
 cass-3   token = C   (had data and schema)
 cass-analysis-1   token = A  (has *no* data and is not part of the ring, but
 is trying to be included in cluster schema)
 A simplified version of describe cluster  for the original cluster now
 shows:
 Cluster Information:
    Schema versions:
 SCHEMA-UUID-1: [cass-1, cass-2, cass-3]
 SCHEMA-UUID-2: [cass-analysis-1]
 But the simplified ring looks like this (has only 3 nodes instead of 4):
 Host       Owns     Token
 cass-1     33%       A
 cass-2     33%       B
 cass-3     33%       C
 The original cluster is still working correctly but all live schema updates
 are failing because of the inconsistent schema versions introduced by the
 unwanted node.
 From my perspective, a simple fix seems to be for cassandra to exclude nodes
 that aren't part of the ring from the schema consistency requirements.  Any
 reason that wouldn't work?
 And aside from a possible code patch, any recommendations as to how I can
 best fix this given the current 8.4 release?

 On Thu, Oct 13, 2011 at 12:14 AM, Jonathan Ellis jbel...@gmail.com wrote:

 Does nodetool removetoken not work?

 On Thu, Oct 13, 2011 at 12:59 AM, Eric Czech e...@nextbigsound.com
 wrote:
  Not sure if anyone has seen this before but it's really killing me right
  now.  Perhaps that was too long of a description of the issue so here's
  a
  more succinct question -- How do I remove nodes associated with a
  cluster
  that contain no data and have no reason to be associated with the
  cluster
  whatsoever?
  My last resort here is to stop cassandra (after recording all tokens for
  each node), set the initial token for each node in the cluster in
  cassandra.yaml, manually delete the LocationInfo* sstables in the system
  keyspace, and then restart.  I'm hoping there's a simpler, less
  seemingly
  risky way to do this so please, please let me know if that's true!
  Thanks again.
  - Eric
  On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech e...@nextbigsound.com
  wrote:
 
  Hi, I'm having what I think is a fairly uncommon schema issue --
  My situation is that I had a cluster with 10 nodes and a consistent
  schema.  Then, in an experiment to setup a second cluster with the same
  information (by copying the raw sstables), I left the LocationInfo*
  sstables
  in the system keyspace in the new cluster and after starting the second
  cluster, I realized that the two clusters were discovering each other
  when
  they shouldn't have been.  Since then, I changed the cluster name for
  the
  second cluster and made sure to delete the LocationInfo* sstables
  before
  starting it and the two clusters are now operating independent of one
  another for the most part.  The only remaining connection between the
  two
  seems to be that the first cluster is still maintaining references to
  nodes
  in the second cluster in the schema versions despite those nodes not
  actually being part of the ring.
  Here's what my describe cluster looks like on the original cluster:
  Cluster Information:
     Snitch: org.apache.cassandra.locator.SimpleSnitch
     Partitioner: org.apache.cassandra.dht.RandomPartitioner
     Schema versions:
  48971cb0-e9ff-11e0--eb9eab7d90bf: [INTENTIONAL_IP1,
  INTENTIONAL_IP2, ..., INTENTIONAL_IP10]
  848bcfc0-eddf-11e0--8a3bb58f08ff: [NOT_INTENTIONAL_IP1,
  NOT_INTENTIONAL_IP2]
  The second cluster, however, contains no schema versions involving
  nodes
  from the first cluster.
  My question then is, how can I remove those schema versions from the
  original cluster that are associated with the unwanted nodes from the
  second
  cluster?  Is there any way to remove or evict an IP from a cluster
  instead
  of just a 

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau

Hi Zach,
thanks for that good idea. Unfortunately, our list needs to be rewritten
often because our data is far from being evenly distributed.


However, we could get this under control, but there is a more severe
problem: random access is very hard to implement on a structure with
undefined distances between two consecutive index numbers. We absolutely
need random access because the lists are too big to do this on the
application side :-(


Kind regards
Matthias

On 10/13/2011 02:30 PM, Zach Richardson wrote:

Matthias,

This is an interesting problem.

I would consider using long's as the column type, where your column
names are evenly distributed longs in sort order when you first write
your list out.  So if you have items A and C with the long column
names 1000 and 2000, and then you have to insert B, it gets inserted
at 1500.  Once you run out of room between any two column name
entries, i.e 1000, 1001, 1002 entries are all taken at any spot in the
list, go ahead and re-write the list.

If your unencrypted data is uniformly distributed, you will have very
few collisions on your column names and should not have to re-write
the list to often.

If your lists are small enough, then you could use ints to save space,
but will then have to re-write the list more often.

Thanks,

Zach

On Thu, Oct 13, 2011 at 2:47 AM, Matthias Pfaup...@l3s.de  wrote:

Hi Stephen,
this is a great idea but unfortunately doesn't work for us either as we can
not store the data in an unencrypted form.

Kind regards
Matthias

On 10/12/2011 07:42 PM, Stephen Connolly wrote:


could you prefix the data with 3-4 bytes of a linear hash of the
unencypted data? it wouldn't be a perfect sort, but you'd have less of a
range to query to get the sorted values?

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on
the screen

On 12 Oct 2011 17:57, Matthias Pfaup...@l3s.demailto:p...@l3s.de
wrote:

Unfortunately, that is not an option as we have to store the data in
an compressed and encrypted and therefore binary and non-sortable form.

On 10/12/2011 06:39 PM, David McNelis wrote:

Is it an option to not convert the data to binary prior to
inserting
into Cassandra?  Also, how large are the strings you're sorting?
  If its
viable to not convert to binary before writing to Cassandra, and
you use
one of the string based column ordering techniques (utf8, ascii,
for
example), then the data would be sorted without you  needing to
specifically worry about that.  Of course, if the strings are
lengthy
you could run into  additional issues.

On Wed, Oct 12, 2011 at 11:34 AM, Matthias Pfaup...@l3s.de
mailto:p...@l3s.de
mailto:p...@l3s.demailto:p...@l3s.de  wrote:

Hi there,
we are currently building a prototype based on cassandra and
came
into problems on implementing sorted lists containing
millions of items.

The special thing about the items of our lists is, that
cassandra is
not able to sort them as the data is stored in a binary
format which
is not sortable. However, we are able to sort the data
before the
plain data gets encoded (our application is responsible for
the order).

First Approach: Storing Lists in ColumnFamilies
***
We first tried to map the list to a single row of a
ColumnFamily in
a way that the index of the list is mapped to the column
names and
the items of the list to the column values. The column names
are
increasing numbers which define the sort order.
This has the major drawback that big parts of the list have
to be
rewritten on inserts (because the column names are numbered
by their
index), which are quite common.


Second Approach: Storing the whole List as Binary Data:
***
We tried to store the compressed list in a single column.
However,
this is only feasible for smaller lists. Our lists are far
to big
leading to multi megabyte reads and writes. As we need to
read and
update the lists quite often, this would put our Cassandra
cluster
under a lot of pressure.

Ideal Solution: Native support for storing lists
***
We would be very happy with a way to store a list of sorted
values
without making improper use of column names for the list
index. This
implies that we would need a possibility to insert values at
defined
positions. We know that this could lead to problems with
concurrent
inserts in a distributed 

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Nope, there was definitely no intersection of the seed nodes between the two
clusters so I'm fairly certain that the second cluster found out about the
first through what was in the LocationInfo* system tables.  Also, I don't
think that procedure will really help because I don't actually want the
schema on cass-analysis-1 to be consistent with the schema in the original
cluster -- I just want to totally remove it.

On Thu, Oct 13, 2011 at 8:01 AM, Mohit Anchlia mohitanch...@gmail.comwrote:

 Do you have same seed node specified in cass-analysis-1 as cass-1,2,3?
 I am thinking that changing the seed node in cass-analysis-2 and
 following the directions in
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve
 the problem. Somone please correct me.

 On Thu, Oct 13, 2011 at 12:05 AM, Eric Czech e...@nextbigsound.com
 wrote:
  I don't think that's what I'm after here since the unwanted nodes were
  originally assimilated into the cluster with the same initial_token
 values
  as other nodes that were already in the cluster (that have, and still do
  have, useful data).  I know this is an awkward situation so I'll try to
  depict it in a simpler way:
  Let's say I have a simplified version of our production cluster that
 looks
  like this -
  cass-1   token = A
  cass-2   token = B
  cass-3   token = C
  Then I tried to create a second cluster that looks like this -
  cass-analysis-1   token = A  (and contains same data as cass-1)
  cass-analysis-2   token = B  (and contains same data as cass-2)
  cass-analysis-3   token = C  (and contains same data as cass-3)
  But after starting the second cluster, things got crossed up between the
  clusters and here's what the original cluster now looks like -
  cass-1   token = A   (has data and schema)
  cass-2   token = B   (has data and schema)
  cass-3   token = C   (had data and schema)
  cass-analysis-1   token = A  (has *no* data and is not part of the ring,
 but
  is trying to be included in cluster schema)
  A simplified version of describe cluster  for the original cluster now
  shows:
  Cluster Information:
 Schema versions:
  SCHEMA-UUID-1: [cass-1, cass-2, cass-3]
  SCHEMA-UUID-2: [cass-analysis-1]
  But the simplified ring looks like this (has only 3 nodes instead of 4):
  Host   Owns Token
  cass-1 33%   A
  cass-2 33%   B
  cass-3 33%   C
  The original cluster is still working correctly but all live schema
 updates
  are failing because of the inconsistent schema versions introduced by the
  unwanted node.
  From my perspective, a simple fix seems to be for cassandra to exclude
 nodes
  that aren't part of the ring from the schema consistency requirements.
  Any
  reason that wouldn't work?
  And aside from a possible code patch, any recommendations as to how I can
  best fix this given the current 8.4 release?
 
  On Thu, Oct 13, 2011 at 12:14 AM, Jonathan Ellis jbel...@gmail.com
 wrote:
 
  Does nodetool removetoken not work?
 
  On Thu, Oct 13, 2011 at 12:59 AM, Eric Czech e...@nextbigsound.com
  wrote:
   Not sure if anyone has seen this before but it's really killing me
 right
   now.  Perhaps that was too long of a description of the issue so
 here's
   a
   more succinct question -- How do I remove nodes associated with a
   cluster
   that contain no data and have no reason to be associated with the
   cluster
   whatsoever?
   My last resort here is to stop cassandra (after recording all tokens
 for
   each node), set the initial token for each node in the cluster in
   cassandra.yaml, manually delete the LocationInfo* sstables in the
 system
   keyspace, and then restart.  I'm hoping there's a simpler, less
   seemingly
   risky way to do this so please, please let me know if that's true!
   Thanks again.
   - Eric
   On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech e...@nextbigsound.com
   wrote:
  
   Hi, I'm having what I think is a fairly uncommon schema issue --
   My situation is that I had a cluster with 10 nodes and a consistent
   schema.  Then, in an experiment to setup a second cluster with the
 same
   information (by copying the raw sstables), I left the LocationInfo*
   sstables
   in the system keyspace in the new cluster and after starting the
 second
   cluster, I realized that the two clusters were discovering each other
   when
   they shouldn't have been.  Since then, I changed the cluster name for
   the
   second cluster and made sure to delete the LocationInfo* sstables
   before
   starting it and the two clusters are now operating independent of one
   another for the most part.  The only remaining connection between the
   two
   seems to be that the first cluster is still maintaining references to
   nodes
   in the second cluster in the schema versions despite those nodes not
   actually being part of the ring.
   Here's what my describe cluster looks like on the original cluster:
   Cluster Information:
  Snitch: org.apache.cassandra.locator.SimpleSnitch
  

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Brandon Williams
You're running into https://issues.apache.org/jira/browse/CASSANDRA-3259

Try upgrading and doing a rolling restart.

-Brandon

On Thu, Oct 13, 2011 at 9:11 AM, Eric Czech e...@nextbigsound.com wrote:
 Nope, there was definitely no intersection of the seed nodes between the two
 clusters so I'm fairly certain that the second cluster found out about the
 first through what was in the LocationInfo* system tables.  Also, I don't
 think that procedure will really help because I don't actually want the
 schema on cass-analysis-1 to be consistent with the schema in the original
 cluster -- I just want to totally remove it.

 On Thu, Oct 13, 2011 at 8:01 AM, Mohit Anchlia mohitanch...@gmail.com
 wrote:

 Do you have same seed node specified in cass-analysis-1 as cass-1,2,3?
 I am thinking that changing the seed node in cass-analysis-2 and
 following the directions in
 http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve
 the problem. Somone please correct me.

 On Thu, Oct 13, 2011 at 12:05 AM, Eric Czech e...@nextbigsound.com
 wrote:
  I don't think that's what I'm after here since the unwanted nodes were
  originally assimilated into the cluster with the same initial_token
  values
  as other nodes that were already in the cluster (that have, and still do
  have, useful data).  I know this is an awkward situation so I'll try to
  depict it in a simpler way:
  Let's say I have a simplified version of our production cluster that
  looks
  like this -
  cass-1   token = A
  cass-2   token = B
  cass-3   token = C
  Then I tried to create a second cluster that looks like this -
  cass-analysis-1   token = A  (and contains same data as cass-1)
  cass-analysis-2   token = B  (and contains same data as cass-2)
  cass-analysis-3   token = C  (and contains same data as cass-3)
  But after starting the second cluster, things got crossed up between the
  clusters and here's what the original cluster now looks like -
  cass-1   token = A   (has data and schema)
  cass-2   token = B   (has data and schema)
  cass-3   token = C   (had data and schema)
  cass-analysis-1   token = A  (has *no* data and is not part of the ring,
  but
  is trying to be included in cluster schema)
  A simplified version of describe cluster  for the original cluster now
  shows:
  Cluster Information:
     Schema versions:
  SCHEMA-UUID-1: [cass-1, cass-2, cass-3]
  SCHEMA-UUID-2: [cass-analysis-1]
  But the simplified ring looks like this (has only 3 nodes instead of 4):
  Host       Owns     Token
  cass-1     33%       A
  cass-2     33%       B
  cass-3     33%       C
  The original cluster is still working correctly but all live schema
  updates
  are failing because of the inconsistent schema versions introduced by
  the
  unwanted node.
  From my perspective, a simple fix seems to be for cassandra to exclude
  nodes
  that aren't part of the ring from the schema consistency requirements.
   Any
  reason that wouldn't work?
  And aside from a possible code patch, any recommendations as to how I
  can
  best fix this given the current 8.4 release?
 
  On Thu, Oct 13, 2011 at 12:14 AM, Jonathan Ellis jbel...@gmail.com
  wrote:
 
  Does nodetool removetoken not work?
 
  On Thu, Oct 13, 2011 at 12:59 AM, Eric Czech e...@nextbigsound.com
  wrote:
   Not sure if anyone has seen this before but it's really killing me
   right
   now.  Perhaps that was too long of a description of the issue so
   here's
   a
   more succinct question -- How do I remove nodes associated with a
   cluster
   that contain no data and have no reason to be associated with the
   cluster
   whatsoever?
   My last resort here is to stop cassandra (after recording all tokens
   for
   each node), set the initial token for each node in the cluster in
   cassandra.yaml, manually delete the LocationInfo* sstables in the
   system
   keyspace, and then restart.  I'm hoping there's a simpler, less
   seemingly
   risky way to do this so please, please let me know if that's true!
   Thanks again.
   - Eric
   On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech e...@nextbigsound.com
   wrote:
  
   Hi, I'm having what I think is a fairly uncommon schema issue --
   My situation is that I had a cluster with 10 nodes and a consistent
   schema.  Then, in an experiment to setup a second cluster with the
   same
   information (by copying the raw sstables), I left the LocationInfo*
   sstables
   in the system keyspace in the new cluster and after starting the
   second
   cluster, I realized that the two clusters were discovering each
   other
   when
   they shouldn't have been.  Since then, I changed the cluster name
   for
   the
   second cluster and made sure to delete the LocationInfo* sstables
   before
   starting it and the two clusters are now operating independent of
   one
   another for the most part.  The only remaining connection between
   the
   two
   seems to be that the first cluster is still maintaining references
   to
   nodes
   in the second 

Re: [Solved] column index offset miscalculation

2011-10-13 Thread Thomas Richter
Thanks for the hint.

Ticket created: https://issues.apache.org/jira/browse/CASSANDRA-3358

Best,

Thomas

On 10/13/2011 03:27 PM, Sylvain Lebresne wrote:
 JIRA is not read-only, you should be able to create a ticket at
 https://issues.apache.org/jira/browse/CASSANDRA, though
 that probably require that you create an account.
 
 --
 Sylvain
 
 On Thu, Oct 13, 2011 at 3:20 PM, Thomas Richter t...@tricnet.de wrote:
 Hi Aaron,

 the fix does the trick. I wonder why nobody else ran into this before...
 I checked org/apache/cassandra/db/ColumnIndexer.java in 0.7.9, 0.8.7 and
 1.0.0-rc2 and all seem to be affected.

 Looks like public Jira is readonly - so I'm not sure about how to continue.

 Best,

 Thomas

 On 10/13/2011 10:52 AM, Thomas Richter wrote:
 Hi Aaron,

 I guess i found it :-).

 I added logging for the used IndexInfo to
 SSTableNamesIterator.readIndexedColumns and got negative index positions
 for the missing columns. This is the reason why the columns are not
 loaded from sstable.

 So I had a look at ColumnIndexer.serializeInternal and there it is:

 int endPosition = 0, startPosition = -1;

 Should be:

 long endPosition = 0, startPosition = -1;

 I'm currently running a compaction with a fixed version to verify.

 Best,

 Thomas

 On 10/12/2011 11:54 PM, aaron morton wrote:
 Sounds a lot like the column is deleted.

 IIRC this is where the columns from various SSTables are reduced
 https://github.com/apache/cassandra/blob/cassandra-0.8/src/java/org/apache/cassandra/db/filter/QueryFilter.java#L117

 The call to ColumnFamily.addColumn() is where the column instance may be 
 merged with other instances.

 A

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com


 On 13/10/2011, at 5:33 AM, Thomas Richter wrote:

 Hi Aaron,

 I cannot read the column with a slice query.
 The slice query only returns data till a certain column and after that i
 only get empty results.

 I added log output to QueryFilter.isRelevant to see if the filter is
 dropping the column(s) but it doesn't even show up there.

 Next thing i will check check is the diff between columns contained in
 json export and columns fetched with the slice query, maybe this gives
 more clue...

 Any other ideas where to place more debugging output to see what's
 happening?

 Best,

 Thomas

 On 10/11/2011 12:46 PM, aaron morton wrote:
 kewl,

 * Row is not deleted (other columns can be read, row survives compaction
 with GCGraceSeconds=0)

 IIRC row tombstones can hang around for a while (until gc grace has 
 passed), and they only have an effect on columns that have a lower 
 timestamp. So it's possible to read columns from a row with a tombstone.

 Can you read the column using a slice range rather than specifying it's 
 name ?

 Aaron

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 11/10/2011, at 11:15 PM, Thomas Richter wrote:

 Hi Aaron,

 i invalidated the caches but nothing changed. I didn't get the mentioned
 log line either, but as I read the code SliceByNamesReadCommand uses
 NamesQueryFilter and not SliceQueryFilter.

 Next, there is only one SSTable.

 I can rule out that the row is deleted because I deleted all other rows
 in that CF to reduce data size and speed up testing. I set
 GCGraceSeconds to zero and ran a compaction. All other rows are gone,
 but i can still access at least one column from the left row.
 So as far as I understand it, there should not be a tombstone on row 
 level.

 To make it a list:

 * One SSTable, one row
 *
 * Row is not deleted (other columns can be read, row survives compaction
 with GCGraceSeconds=0)
 * Most columns can be read by get['row']['col'] from cassandra-cli
 * Some columns can not be read by get['row']['col'] from cassandra-cli
 but can be found in output of sstable2json
 * unreadable data survives compaction with GCGraceSeconds=0 (checked
 with sstable2json)
 * Invalidation caches does not help
 * Nothing in the logs

 Does that point into any direction where i should look next?

 Best,

 Thomas

 On 10/11/2011 10:30 AM, aaron morton wrote:
 Nothing jumps out. The obvious answer is that the column has been 
 deleted. Did you check all the SSTables ?

 It looks like query returned from row cache, otherwise you would see 
 this as well…

 DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java 
 (line 123) collecting 0 of 2147483647: 
 1318294191654059:false:354@1318294191654861

 Which would mean a version of the column was found.

 If you invalidate the cache with nodetool and run the query and the 
 log message appears it will mean the column was read from (all of the) 
 sstables. If you do not get a column returned I would say there is a 
 tombstone in place. It's either a row level or a column level one.

 Hope that helps.

 -
 

Re: supercolumns vs. prefixing columns of same data type?

2011-10-13 Thread Dean Hiller
great video, thanks!

On Thu, Oct 13, 2011 at 7:45 AM, hani elabed hani.ela...@gmail.com wrote:

 Hi Dean,
 I don't have have an answer to your question, but just in case you haven't
 seen this screencast by Ed Anuff on Cassandra Indexes, it helped me a lot.
 http://blip.tv/datastax/indexing-in-cassandra-5495633

 Hani


 On Wed, Oct 12, 2011 at 12:18 PM, Dean Hiller d...@alvazan.com wrote:

 I heard cassandra may be going the direction of removing super column and
 users are starting to just use prefixes in front of the column.

 The reason I ask is I was going the way of only using supercolumns and
 then many tables were fixed with just one supercolumn per row as the
 structure for that table was simplethis kept the api we have on top of
 Hector extremely simple not having to deal with columns vs. supercolumns.
 What are people's thoughts on this?

 Dealing in columnfamilies where some have supercolumns and some don't I
 think personally is a painful way to go.going with just one way and
 sticking with it sure makes the apis easier and it's much easier to apply
 AOP type stuff to that ONE insert method rather than having two insert
 methods.  So what is the direction of casssandra project and the
 recommendation?

 thanks,
 Dean





Re: Hector Problem Basic one

2011-10-13 Thread Patricio Echagüe
Hi, Hector does not retry on a down server. In the unit tests where you have
just one server, Hector will pass the exception to the client.

Can you please tell us what your test looks like?
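
A minimal sketch of the client-side retry configuration being discussed, assuming Hector's CassandraHostConfigurator exposes downed-host retry settings under roughly these names (they may differ between Hector versions); the host and cluster name are placeholders:

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class HectorRetryConfig {
    public static void main(String[] args) {
        CassandraHostConfigurator config = new CassandraHostConfigurator("127.0.0.1:9160");
        // Periodically re-check hosts marked down and return them to the pool
        // once they respond again.
        config.setRetryDownedHosts(true);
        config.setRetryDownedHostsDelayInSeconds(30);

        Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", config);
        // ... build Keyspace and queries from this cluster as usual.
    }
}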

2011/10/12 Wangpei (Peter) peter.wang...@huawei.com

  I only saw this error message when all Cassandra nodes are down.

 How do you get the Cluster, and how do you set the hosts?


 *From:* CASSANDRA learner [mailto:cassandralear...@gmail.com]
 *Sent:* October 12, 2011 14:30
 *To:* user@cassandra.apache.org
 *Subject:* Re: Hector Problem Basic one


 Thanks for the reply, Ben.

 Actually, the problem is that I am not able to run a basic Hector example
 from Eclipse. It's throwing me.prettyprint.hector.api.exceptions.HectorException:
 All host pools marked down. Retry burden pushed out to client.

 Can you please let me know why I am getting this?

 

 On Tue, Oct 11, 2011 at 3:54 PM, Ben Ashton b...@bossastudios.com wrote:

 Hey,

 We had this one; even though the hector documentation says that it
 retries failed servers every 30 seconds by default, it doesn't.

 Once we explicitly set it to X seconds, whenever there is a failure,
 i.e. with the network (AWS), it will retry and add it back into the pool.

 Ben


 On 11 October 2011 11:09, CASSANDRA learner cassandralear...@gmail.com
 wrote:
  Hi Every One,
 
  Actually I was using cassandra a long time back and when I tried today, I am
  getting a problem from eclipse. When I am trying to run a basic hector
  (java) example, I am getting an exception:
  me.prettyprint.hector.api.exceptions.HectorException: All host pools marked
  down. Retry burden pushed out to client. But my server is up. Nodetool
  also shows that it is up. I don't know what happens.

  1.) Is it anything to do with the JMX port?
  2.) What is the storage port in cassandra.yaml and the JMX port in
  cassandra-env.sh?
 
 
 




RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
I upgraded to cassandra 0.8.7, and the problem persists.

Scott

From: Brandon Williams [dri...@gmail.com]
Sent: Monday, October 10, 2011 12:28 PM
To: user@cassandra.apache.org
Subject: Re: MapReduce with two ethernet cards

On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:
 Hi all,
 This may be a silly question, but I'm at a bit of a loss, and was hoping for
 some help.
 I have a Cassandra cluster set up with two NICs--one for internal
 communication between cassandra machines (10.1.1.*), and one to respond to
 Thrift RPC (172.28.*.*).
 I also have a Hadoop cluster set up, which, for unrelated reasons, has to
 remain separate from Cassandra, so I've written a little MapReduce job to
 copy data from Cassandra to Hadoop. However, when I try to run my job, I
 get
 java.io.IOException: failed connecting to all endpoints
 10.1.1.24,10.1.1.17,10.1.1.16
 which is puzzling to me. It seems like the MR is attempting to connect to
 the internal communication IPs instead of the external Thrift IPs. Since I
 set up a firewall to block external access to the internal IPs of Cassandra,
 this is obviously going to fail.
 So my question is: why does Cassandra MR seem to be grabbing the
 listen_address instead of the Thrift one. Presuming it's not a funky
 configuration error or something on my part, is that strictly necessary? All
 told, I'd prefer if it was connecting to the Thrift IPs, but if it can't,
 should I open up port 7000 or port 9160 between Hadoop and Cassandra?
 Thanks for your help,
 Scott

Your cassandra is old, upgrade to the latest version.

-Brandon


Re: Efficiency of hector's setRowCount

2011-10-13 Thread Patricio Echagüe
Hi Don. No it will not. IndexedSlicesQuery will read just the amount of rows
specified by RowCount and will go to the DB to get the new page when needed.

SetRowCount is doing indexClause.setCount(rowCount);

On Mon, Oct 10, 2011 at 3:52 PM, Don Smith dsm...@likewise.com wrote:

  Hector's IndexedSlicesQuery has a setRowCount method that you can use to
  page through the results, as described in
  https://github.com/rantav/hector/wiki/User-Guide.

  rangeSlicesQuery.setRowCount(1001);
  ...
  rangeSlicesQuery.setKeys(lastRow.getKey(), "");

 Is it efficient?  Specifically, suppose my query returns 100,000 results
 and I page through batches of 1000 at a time (making 100 executes of the
 query). Will it internally retrieve all the results each time (but pass only
 the desired set of 1000 or so to me)? Or will it optimize queries to avoid
 the duplication?  I presume the latter. :)

 Can IndexedSlicesQuery's setStartKey method be used for the same effect?

   Thanks,  Don



Re: Efficiency of hector's setRowCount (and setStartKey!)

2011-10-13 Thread Don Smith
It's actually setStartKey that's the important method call (in 
combination with setRowCount). So I should have been clearer.


The following code performs as expected, as far as returning the
expected data in the expected order.  I believe that the use of
IndexedSlicesQuery's setStartKey will support efficient queries --
avoiding re-pulling the entire data set from Cassandra. Correct?



void demoPaging() {
    String lastKey = processPage("don", "");   // get first batch, starting with "" (smallest key)
    lastKey = processPage("don", lastKey);     // get second batch starting with previous last key
    lastKey = processPage("don", lastKey);     // get third batch starting with previous last key
    // ...
}

// return last key processed, null when no records left
String processPage(String username, String startKey) {
    String lastKey = null;
    IndexedSlicesQuery<String, String, String> indexedSlicesQuery =
        HFactory.createIndexedSlicesQuery(keyspace, stringSerializer,
            stringSerializer, stringSerializer);

    indexedSlicesQuery.addEqualsExpression("user", username);
    indexedSlicesQuery.setColumnNames("source", "ip");
    indexedSlicesQuery.setColumnFamily(ourColumnFamilyName);
    indexedSlicesQuery.setStartKey(startKey);
    indexedSlicesQuery.setRowCount(batchSize);

    QueryResult<OrderedRows<String, String, String>> result = indexedSlicesQuery.execute();
    OrderedRows<String, String, String> rows = result.get();

    for (Row<String, String, String> row : rows) {
        if (row == null) { continue; }
        totalCount++;
        String key = row.getKey();
        if (!startKey.equals(key)) { lastKey = key; }
    }
    totalCount--;
    return lastKey;
}






On 10/13/2011 09:15 AM, Patricio Echagüe wrote:
Hi Don. No it will not. IndexedSlicesQuery will read just the amount 
of rows specified by RowCount and will go to the DB to get the new 
page when needed.


SetRowCount is doing indexClause.setCount(rowCount);

On Mon, Oct 10, 2011 at 3:52 PM, Don Smith dsm...@likewise.com 
mailto:dsm...@likewise.com wrote:


Hector's IndexedSlicesQuery has a setRowCount method that you can
use to page through the results, as described in
https://github.com/rantav/hector/wiki/User-Guide .

rangeSlicesQuery.setRowCount(1001);
 .
rangeSlicesQuery.setKeys(lastRow.getKey(),  );

Is it efficient?  Specifically, suppose my query returns 100,000
results and I page through batches of 1000 at a time (making 100
executes of the query). Will it internally retrieve all the
results each time (but pass only the desired set of 1000 or so to
me)? Or will it optimize queries to avoid the duplication?  I
presume the latter. :)

Can IndexedSlicesQuery's setStartKey method be used for the same
effect?

  Thanks,  Don






Re: Efficiency of hector's setRowCount (and setStartKey!)

2011-10-13 Thread Patricio Echagüe
On Thu, Oct 13, 2011 at 9:39 AM, Don Smith dsm...@likewise.com wrote:

 It's actually setStartKey that's the important method call (in combination
 with setRowCount). So I should have been clearer.

 The following code performs as expected, as far as returning the expected
 data in the expected order.  I believe that the use of IndexedSliceQuery's
 setStartKey will support efficient queries -- avoiding repulling the entire
 data set from cassandra. Correct?


correct



 void demoPaging() {
     String lastKey = processPage("don", "");   // get first batch, starting with "" (smallest key)
     lastKey = processPage("don", lastKey);     // get second batch starting with previous last key
     lastKey = processPage("don", lastKey);     // get third batch starting with previous last key
     // ...
 }

 // return last key processed, null when no records left
 String processPage(String username, String startKey) {
     String lastKey = null;
     IndexedSlicesQuery<String, String, String> indexedSlicesQuery =
         HFactory.createIndexedSlicesQuery(keyspace, stringSerializer,
             stringSerializer, stringSerializer);

     indexedSlicesQuery.addEqualsExpression("user", username);
     indexedSlicesQuery.setColumnNames("source", "ip");
     indexedSlicesQuery.setColumnFamily(ourColumnFamilyName);
     indexedSlicesQuery.setStartKey(startKey);
     indexedSlicesQuery.setRowCount(batchSize);

     QueryResult<OrderedRows<String, String, String>> result = indexedSlicesQuery.execute();
     OrderedRows<String, String, String> rows = result.get();

     for (Row<String, String, String> row : rows) {
         if (row == null) { continue; }
         totalCount++;
         String key = row.getKey();
         if (!startKey.equals(key)) { lastKey = key; }
     }
     totalCount--;
     return lastKey;
 }






 On 10/13/2011 09:15 AM, Patricio Echagüe wrote:

 Hi Don. No it will not. IndexedSlicesQuery will read just the amount of
 rows specified by RowCount and will go to the DB to get the new page when
 needed.

  SetRowCount is doing indexClause.setCount(rowCount);

 On Mon, Oct 10, 2011 at 3:52 PM, Don Smith dsm...@likewise.com wrote:

 Hector's IndexedSlicesQuery has a setRowCount method that you can use to
 page through the results, as described in
 https://github.com/rantav/hector/wiki/User-Guide .

 rangeSlicesQuery.setRowCount(1001);
  .
 rangeSlicesQuery.setKeys(lastRow.getKey(),  );

 Is it efficient?  Specifically, suppose my query returns 100,000 results
 and I page through batches of 1000 at a time (making 100 executes of the
 query). Will it internally retrieve all the results each time (but pass only
 the desired set of 1000 or so to me)? Or will it optimize queries to avoid
 the duplication?  I presume the latter. :)

 Can IndexedSlicesQuery's setStartKey method be used for the same effect?

   Thanks,  Don






Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau

Hi Zach,
thanks for your additional input. You are absolutely right: The long 
namespace should be big enough. We are going to insert up to 2^32 values 
into the list.


We only need support for get(index), insert(index) and remove(index);
get and insert will be used very often. Remove is also needed but
used very rarely.


Kind regards
Matthias

On 10/13/2011 04:49 PM, Zach Richardson wrote:

Matthias,

Answers below.

On Thu, Oct 13, 2011 at 9:03 AM, Matthias Pfaup...@l3s.de  wrote:

Hi Zach,
thanks for that good idea. Unfortunately, our list needs to be rewritten
often because our data is far from being evenly distributed.


This shouldn't be a problem if you use longs.  If you were to space
them evenly across the full long range at the original write (with N
objects), and N was 10,000,000, you could still fit another
1,844,674,407,370 entries between any two neighbours.


However, we could get this under control, but there is a more severe problem:
random access is very hard to implement on a structure with undefined
distances between two consecutive index numbers. We absolutely need random
access because the lists are too big to handle on the application side :-(


I'm guessing you need to be able to implement all of the traditional
get(index), set(index), insert(index) type operations on the list.
Once you start trying to do that, you start to hit all of the same
problems you get with different in memory list implementations based
on which operation is most important.

Could you provide some more information on what operations will be
performed the most, and how important they are.  I think that would
help anyone recommend a path to take.

Zach


Kind regards
Matthias

On 10/13/2011 02:30 PM, Zach Richardson wrote:


Matthias,

This is an interesting problem.

I would consider using long's as the column type, where your column
names are evenly distributed longs in sort order when you first write
your list out.  So if you have items A and C with the long column
names 1000 and 2000, and then you have to insert B, it gets inserted
at 1500.  Once you run out of room between any two column name
entries, i.e 1000, 1001, 1002 entries are all taken at any spot in the
list, go ahead and re-write the list.

If your unencrypted data is uniformly distributed, you will have very
few collisions on your column names and should not have to re-write
the list too often.

If your lists are small enough, then you could use ints to save space,
but will then have to re-write the list more often.

Thanks,

Zach
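
A minimal sketch of that gap-based column naming, kept application-side for
clarity (the in-memory TreeMap stands in for one Cassandra row with LongType
column names; persisting it through Hector or thrift is left out, and all names
here are illustrative):

    import java.util.List;
    import java.util.TreeMap;

    // Sketch: a sorted "row" keyed by long column names with gaps, so a new
    // item can usually be inserted between two neighbours without renumbering
    // the whole list.
    public class GapList {
        private final TreeMap<Long, String> columns = new TreeMap<Long, String>();

        // Initial write: space N items evenly across the positive long range
        // (the positive half is used only to keep the sketch simple).
        public void init(List<String> items) {
            long step = Long.MAX_VALUE / (items.size() + 1);
            long name = step;
            for (String item : items) {
                columns.put(name, item);
                name += step;
            }
        }

        // Insert between two existing column names; returns false when the gap
        // is exhausted and the whole list has to be rewritten (re-spaced).
        public boolean insertBetween(long before, long after, String item) {
            long mid = before + (after - before) / 2;
            if (mid == before || columns.containsKey(mid)) {
                return false; // no room left between the neighbours
            }
            columns.put(mid, item);
            return true;
        }
    }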

On Thu, Oct 13, 2011 at 2:47 AM, Matthias Pfaup...@l3s.dewrote:


Hi Stephen,
this is a great idea but unfortunately doesn't work for us either as we
can
not store the data in an unencrypted form.

Kind regards
Matthias

On 10/12/2011 07:42 PM, Stephen Connolly wrote:


could you prefix the data with 3-4 bytes of a linear hash of the
unencrypted data? It wouldn't be a perfect sort, but you'd have less of a
range to query to get the sorted values?

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on
the screen

On 12 Oct 2011 17:57, Matthias Pfaup...@l3s.demailto:p...@l3s.de
wrote:

Unfortunately, that is not an option as we have to store the data in
a compressed and encrypted, and therefore binary and non-sortable,
form.

On 10/12/2011 06:39 PM, David McNelis wrote:

Is it an option to not convert the data to binary prior to
inserting
into Cassandra?  Also, how large are the strings you're sorting?
  If its
viable to not convert to binary before writing to Cassandra, and
you use
one of the string based column ordering techniques (utf8, ascii,
for
example), then the data would be sorted without you  needing to
specifically worry about that.  Of course, if the strings are
lengthy
you could run into  additional issues.

On Wed, Oct 12, 2011 at 11:34 AM, Matthias Pfaup...@l3s.de
mailto:p...@l3s.de
mailto:p...@l3s.demailto:p...@l3s.dewrote:

Hi there,
we are currently building a prototype based on cassandra and
came
into problems on implementing sorted lists containing
millions of items.

The special thing about the items of our lists is, that
cassandra is
not able to sort them as the data is stored in a binary
format which
is not sortable. However, we are able to sort the data
before the
plain data gets encoded (our application is responsible for
the order).

First Approach: Storing Lists in ColumnFamilies
***
We first tried to map the list to a single row of a
ColumnFamily in
a way that the index of the list is mapped to the column
names and
the items of the list to the column values. The column names

Re: MapReduce with two ethernet cards

2011-10-13 Thread Brandon Williams
What is your rpc_address set to?  If it's 0.0.0.0 (bind everything)
then that's not going to work if listen_address is blocked.

-Brandon

On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines scott.fi...@nisc.coop wrote:
 I upgraded to cassandra 0.8.7, and the problem persists.

 Scott
 
 From: Brandon Williams [dri...@gmail.com]
 Sent: Monday, October 10, 2011 12:28 PM
 To: user@cassandra.apache.org
 Subject: Re: MapReduce with two ethernet cards

 On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:
 Hi all,
 This may be a silly question, but I'm at a bit of a loss, and was hoping for
 some help.
 I have a Cassandra cluster set up with two NICs--one for internel
 communication between cassandra machines (10.1.1.*), and one to respond to
 Thrift RPC (172.28.*.*).
 I also have a Hadoop cluster set up, which, for unrelated reasons, has to
 remain separate from Cassandra, so I've written a little MapReduce job to
 copy data from Cassandra to Hadoop. However, when I try to run my job, I
 get
 java.io.IOException: failed connecting to all endpoints
 10.1.1.24,10.1.1.17,10.1.1.16
 which is puzzling to me. It seems like the MR is attempting to connect to
 the internal communication IPs instead of the external Thrift IPs. Since I
 set up a firewall to block external access to the internal IPs of Cassandra,
 this is obviously going to fail.
 So my question is: why does Cassandra MR seem to be grabbing the
 listen_address instead of the Thrift one. Presuming it's not a funky
 configuration error or something on my part, is that strictly necessary? All
 told, I'd prefer if it was connecting to the Thrift IPs, but if it can't,
 should I open up port 7000 or port 9160 between Hadoop and Cassandra?
 Thanks for your help,
 Scott

 Your cassandra is old, upgrade to the latest version.

 -Brandon



RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
The listen address on all machines is set to the 10.1.1.* addresses, while the 
thrift rpc address is the 172.28.* addresses


From: Brandon Williams [dri...@gmail.com]
Sent: Thursday, October 13, 2011 12:28 PM
To: user@cassandra.apache.org
Subject: Re: MapReduce with two ethernet cards

What is your rpc_address set to?  If it's 0.0.0.0 (bind everything)
then that's not going to work if listen_address is blocked.

-Brandon

On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines scott.fi...@nisc.coop wrote:
 I upgraded to cassandra 0.8.7, and the problem persists.

 Scott
 
 From: Brandon Williams [dri...@gmail.com]
 Sent: Monday, October 10, 2011 12:28 PM
 To: user@cassandra.apache.org
 Subject: Re: MapReduce with two ethernet cards

 On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:
 Hi all,
 This may be a silly question, but I'm at a bit of a loss, and was hoping for
 some help.
 I have a Cassandra cluster set up with two NICs--one for internel
 communication between cassandra machines (10.1.1.*), and one to respond to
 Thrift RPC (172.28.*.*).
 I also have a Hadoop cluster set up, which, for unrelated reasons, has to
 remain separate from Cassandra, so I've written a little MapReduce job to
 copy data from Cassandra to Hadoop. However, when I try to run my job, I
 get
 java.io.IOException: failed connecting to all endpoints
 10.1.1.24,10.1.1.17,10.1.1.16
 which is puzzling to me. It seems like the MR is attempting to connect to
 the internal communication IPs instead of the external Thrift IPs. Since I
 set up a firewall to block external access to the internal IPs of Cassandra,
 this is obviously going to fail.
 So my question is: why does Cassandra MR seem to be grabbing the
 listen_address instead of the Thrift one. Presuming it's not a funky
 configuration error or something on my part, is that strictly necessary? All
 told, I'd prefer if it was connecting to the Thrift IPs, but if it can't,
 should I open up port 7000 or port 9160 between Hadoop and Cassandra?
 Thanks for your help,
 Scott

 Your cassandra is old, upgrade to the latest version.

 -Brandon



RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
When I look at the source for ColumnFamilyInputFormat, it appears that it does 
a call to client.describe_ring; when you do the equivalent call  with nodetool, 
you get the 10.1.1.* addresses.  This seems to indicate to me that I should 
open up the firewall and attempt to contact those IPs instead of the normal 
thrift IPs. 

That leads me to think that I need to have thrift listening on both IPs, 
though. Would that then be the case?

Scott

From: Scott Fines [scott.fi...@nisc.coop]
Sent: Thursday, October 13, 2011 12:40 PM
To: user@cassandra.apache.org
Subject: RE: MapReduce with two ethernet cards

The listen address on all machines are set to the 10.1.1.* addresses, while the 
thrift rpc address is the 172.28.* addresses


From: Brandon Williams [dri...@gmail.com]
Sent: Thursday, October 13, 2011 12:28 PM
To: user@cassandra.apache.org
Subject: Re: MapReduce with two ethernet cards

What is your rpc_address set to?  If it's 0.0.0.0 (bind everything)
then that's not going to work if listen_address is blocked.

-Brandon

On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines scott.fi...@nisc.coop wrote:
 I upgraded to cassandra 0.8.7, and the problem persists.

 Scott
 
 From: Brandon Williams [dri...@gmail.com]
 Sent: Monday, October 10, 2011 12:28 PM
 To: user@cassandra.apache.org
 Subject: Re: MapReduce with two ethernet cards

 On Mon, Oct 10, 2011 at 11:47 AM, Scott Fines scott.fi...@nisc.coop wrote:
 Hi all,
 This may be a silly question, but I'm at a bit of a loss, and was hoping for
 some help.
 I have a Cassandra cluster set up with two NICs--one for internel
 communication between cassandra machines (10.1.1.*), and one to respond to
 Thrift RPC (172.28.*.*).
 I also have a Hadoop cluster set up, which, for unrelated reasons, has to
 remain separate from Cassandra, so I've written a little MapReduce job to
 copy data from Cassandra to Hadoop. However, when I try to run my job, I
 get
 java.io.IOException: failed connecting to all endpoints
 10.1.1.24,10.1.1.17,10.1.1.16
 which is puzzling to me. It seems like the MR is attempting to connect to
 the internal communication IPs instead of the external Thrift IPs. Since I
 set up a firewall to block external access to the internal IPs of Cassandra,
 this is obviously going to fail.
 So my question is: why does Cassandra MR seem to be grabbing the
 listen_address instead of the Thrift one. Presuming it's not a funky
 configuration error or something on my part, is that strictly necessary? All
 told, I'd prefer if it was connecting to the Thrift IPs, but if it can't,
 should I open up port 7000 or port 9160 between Hadoop and Cassandra?
 Thanks for your help,
 Scott

 Your cassandra is old, upgrade to the latest version.

 -Brandon



Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
in theory, however they have less than 32 bits of entropy from which they
can do that, leaving them with at least 32 more bits of combinations to
try... that's 2 billion or so... must be a big dictionary

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on the
screen
On 13 Oct 2011 17:57, Matthias Pfau p...@l3s.de wrote:

 Hi Stephen,
 this sounds very reasonable. But wouldn't this enable an attacker to
 execute dictionary attacks in order to decrypt the first 8 bytes of the
 plain text?

 Kind regards
 Matthias

 On 10/13/2011 05:03 PM, Stephen Connolly wrote:

 It wouldn't be unencrypted... which is the point

 you use a one way linear hash function to take the first, say 8 bytes,
 of unencrypted data and turn it into 4 bytes of a sort prefix.

 You've lost half the data in the process, so effectively each bit
 is an OR of two bits and you can only infer from 0 values... so data
 is still encrypted, but you have an approximate sorting.

 For example, if your data is US-ASCII text with no numbers, you could
 use Soundex to get the pre-key, so that worst case you have a bucket
 of values in the range.

 Using this technique, a random get will have to get the values at the
 desired prefix +/- a small amount rather than the whole row... on the
 client side you can then decrypt the data and sort that small bucket
 to get the correct index position.

 You could do a 1 byte prefix, but that only gives you at best 256
 buckets and assumes that the first 2 bytes are uniformly
 distributed... you've said your data is not uniformly distributed, so
 a linear hash function sounds like your best bet.

 your hash function should have the property that hash(A) <= hash(B) if
 and only if A <= B

 On 13 October 2011 08:47, Matthias Pfaup...@l3s.de  wrote:

 Hi Stephen,
 this is a great idea but unfortunately doesn't work for us either as we
 can
 not store the data in an unencrypted form.

 Kind regards
 Matthias

 On 10/12/2011 07:42 PM, Stephen Connolly wrote:


 could you prefix the data with 3-4 bytes of a linear hash of the
 unencypted data? it wouldn't be a perfect sort, but you'd have less of a
 range to query to get the sorted values?

 - Stephen

 ---
 Sent from my Android phone, so random spelling mistakes, random nonsense
 words and other nonsense are a direct result of using swype to type on
 the screen

 On 12 Oct 2011 17:57, Matthias 
 Pfaup...@l3s.demailto:pfau@**l3s.dep...@l3s.de
 
 wrote:

Unfortunately, that is not an option as we have to store the data in
an compressed and encrypted and therefore binary and non-sortable
 form.

On 10/12/2011 06:39 PM, David McNelis wrote:

Is it an option to not convert the data to binary prior to
 inserting
into Cassandra?  Also, how large are the strings you're sorting?
  If its
viable to not convert to binary before writing to Cassandra, and
you use
one of the string based column ordering techniques (utf8, ascii,
 for
example), then the data would be sorted without you  needing to
specifically worry about that.  Of course, if the strings are
lengthy
you could run into  additional issues.

On Wed, Oct 12, 2011 at 11:34 AM, Matthias Pfaup...@l3s.de
mailto:p...@l3s.de
mailto:p...@l3s.demailto:pfa**u...@l3s.de p...@l3s.de
  wrote:

Hi there,
we are currently building a prototype based on cassandra and
came
into problems on implementing sorted lists containing
millions of items.

The special thing about the items of our lists is, that
cassandra is
not able to sort them as the data is stored in a binary
format which
is not sortable. However, we are able to sort the data
before the
plain data gets encoded (our application is responsible for
the order).

First Approach: Storing Lists in ColumnFamilies
***
We first tried to map the list to a single row of a
ColumnFamily in
a way that the index of the list is mapped to the column
names and
the items of the list to the column values. The column names
 are
increasing numbers which define the sort order.
This has the major drawback that big parts of the list have
to be
rewritten on inserts (because the column names are numbered
by their
index), which are quite common.


Second Approach: Storing the whole List as Binary Data:
***
We tried to store the compressed list in a single column.
However,
this is only feasible for smaller lists. Our lists are far
to big
leading to multi megabyte reads and writes. As we need to
read and
update the lists quite 

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau

Hi Stephen,
we are hashing the first 8 bytes (8 US-ASCII characters) of text that has 
been written by humans. Wouldn't it be easy for the attacker to do a 
dictionary attack on this text, especially if he knows the language of 
the text?


Kind regards
Matthias

On 10/13/2011 08:20 PM, Stephen Connolly wrote:

in theory, however they have less than 32 bits of entropy from which
they can do that, leaving them with at least 32 more bits of
combinations to try... that's 2 billion or so... must be a big dictionary

- Stephen

---
Sent from my Android phone, so random spelling mistakes, random nonsense
words and other nonsense are a direct result of using swype to type on
the screen

On 13 Oct 2011 17:57, Matthias Pfau p...@l3s.de mailto:p...@l3s.de
wrote:

Hi Stephen,
this sounds very reasonable. But wouldn't this enable an attacker to
execute dictionary attacks in order to decrypt the first 8 bytes
of the plain text?

Kind regards
Matthias

On 10/13/2011 05:03 PM, Stephen Connolly wrote:

It wouldn't be unencrypted... which is the point

you use a one way linear hash function to take the first, say 8
bytes,
of unencrypted data and turn it into 4 bytes of a sort prefix.

You've lost half the data in the process, so effectively
each bit
is an OR of two bits and you can only infer from 0 values... so data
is still encrypted, but you have an approximate sorting.

For example, if your data is US-ASCII text with no numbers, you
could
use Soundex to get the pre-key, so that worst case you have a bucket
of values in the range.

Using this technique, a random get will have to get the values
at the
desired prefix +/- a small amount rather than the whole row...
on the
client side you can then decrypt the data and sort that small bucket
to get the correct index position.

You could do a 1 byte prefix, but that only gives you at best 256
buckets and assumes that the first 2 bytes are uniformly
distributed... you've said your data is not uniformly
distributed, so
a linear hash function sounds like your best bet.

your hash function should have the property that hash(A) <=
hash(B) if and only if A <= B

On 13 October 2011 08:47, Matthias Pfaup...@l3s.de
mailto:p...@l3s.de  wrote:

Hi Stephen,
this is a great idea but unfortunately doesn't work for us
either as we can
not store the data in an unencrypted form.

Kind regards
Matthias

On 10/12/2011 07:42 PM, Stephen Connolly wrote:


could you prefix the data with 3-4 bytes of a linear
hash of the
unencypted data? it wouldn't be a perfect sort, but
you'd have less of a
range to query to get the sorted values?

- Stephen

---
Sent from my Android phone, so random spelling mistakes,
random nonsense
words and other nonsense are a direct result of using
swype to type on
the screen

On 12 Oct 2011 17:57, Matthias Pfaup...@l3s.de
mailto:p...@l3s.demailto:pfau@__l3s.de
mailto:p...@l3s.de
wrote:

Unfortunately, that is not an option as we have to
store the data in
an compressed and encrypted and therefore binary and
non-sortable form.

On 10/12/2011 06:39 PM, David McNelis wrote:

Is it an option to not convert the data to
binary prior to
inserting
into Cassandra?  Also, how large are the strings
you're sorting?
  If its
viable to not convert to binary before writing
to Cassandra, and
you use
one of the string based column ordering
techniques (utf8, ascii,
for
example), then the data would be sorted without
you  needing to
specifically worry about that.  Of course, if
the strings are
lengthy
you could run into  additional issues.

On Wed, Oct 12, 2011 at 11:34 AM, Matthias
Pfaup...@l3s.de mailto:p...@l3s.de
mailto:p...@l3s.de mailto:p...@l3s.de
mailto:p...@l3s.de
mailto:p...@l3s.demailto:pfa...@l3s.de
mailto:p...@l3s.de  wrote:

Hi there,
we are currently building a prototype based
   

Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
Then just use a soundex function on the first word in the text... that
will shrink it sufficiently and give nice buckets in near sequential
order (http://en.wikipedia.org/wiki/Soundex)
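
For what it's worth, a minimal sketch of deriving such a bucket prefix with the
Soundex implementation from Apache Commons Codec (the commons-codec dependency
and the idea of storing the code as a column-name prefix next to the encrypted
payload are assumptions for illustration, not something prescribed in this
thread):

    import org.apache.commons.codec.language.Soundex;

    // Sketch: derive a coarse, near-sequential sort prefix from the first word
    // of the plain text; rows sharing a prefix form one bucket that the client
    // decrypts and sorts exactly.
    public class SoundexPrefix {
        private static final Soundex SOUNDEX = new Soundex();

        public static String prefixFor(String plainText) {
            String firstWord = plainText.trim().split("\\s+")[0];
            return SOUNDEX.soundex(firstWord);   // e.g. "Robert" -> "R163"
        }

        public static void main(String[] args) {
            System.out.println(prefixFor("Robert was here"));  // prints R163
        }
    }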

On 13 October 2011 21:21, Matthias Pfau p...@l3s.de wrote:
 Hi Stephen,
 we are hashing the first 8 bytes (8 US-ASCII characters) of text that has
 been written by humans. Wouldn't it be easy for the attacker to do a
 dictionary attack on this text, especially if he knows the language of the
 text?

 Kind regards
 Matthias

 On 10/13/2011 08:20 PM, Stephen Connolly wrote:

 in theory, however they have less than 32 bits of entropy from which
 they can do that, leaving them with at least 32 more bits of
 combinations to try... that's 2 billion or so... must be a big dictionary

 - Stephen

 ---
 Sent from my Android phone, so random spelling mistakes, random nonsense
 words and other nonsense are a direct result of using swype to type on
 the screen

 On 13 Oct 2011 17:57, Matthias Pfau p...@l3s.de mailto:p...@l3s.de
 wrote:

    Hi Stephen,
    this sounds very reasonable. But wouldn't this enable an attacker to
    execute dictionary attacks in order to decrypt the first 8 bytes
    of the plain text?

    Kind regards
    Matthias

    On 10/13/2011 05:03 PM, Stephen Connolly wrote:

        It wouldn't be unencrypted... which is the point

        you use a one way linear hash function to take the first, say 8
        bytes,
        of unencrypted data and turn it into 4 bytes of a sort prefix.

        You've lost half the data in the process, so effectively
        each bit
        is an OR of two bits and you can only infer from 0 values... so
 data
        is still encrypted, but you have an approximate sorting.

        For example, if your data is US-ASCII text with no numbers, you
        could
        use Soundex to get the pre-key, so that worst case you have a
 bucket
        of values in the range.

        Using this technique, a random get will have to get the values
        at the
        desired prefix +/- a small amount rather than the whole row...
        on the
        client side you can then decrypt the data and sort that small
 bucket
        to get the correct index position.

        You could do a 1 byte prefix, but that only gives you at best 256
        buckets and assumes that the first 2 bytes are uniformly
        distributed... you've said your data is not uniformly
        distributed, so
        a linear hash function sounds like your best bet.

        your hash function should have the property that hash(A) <=
        hash(B) if and only if A <= B

        On 13 October 2011 08:47, Matthias Pfaup...@l3s.de
        mailto:p...@l3s.de  wrote:

            Hi Stephen,
            this is a great idea but unfortunately doesn't work for us
            either as we can
            not store the data in an unencrypted form.

            Kind regards
            Matthias

            On 10/12/2011 07:42 PM, Stephen Connolly wrote:


                could you prefix the data with 3-4 bytes of a linear
                hash of the
                unencypted data? it wouldn't be a perfect sort, but
                you'd have less of a
                range to query to get the sorted values?

                - Stephen

                ---
                Sent from my Android phone, so random spelling mistakes,
                random nonsense
                words and other nonsense are a direct result of using
                swype to type on
                the screen

                On 12 Oct 2011 17:57, Matthias Pfaup...@l3s.de
                mailto:p...@l3s.demailto:pfau@__l3s.de
                mailto:p...@l3s.de
                wrote:

                    Unfortunately, that is not an option as we have to
                store the data in
                    an compressed and encrypted and therefore binary and
                non-sortable form.

                    On 10/12/2011 06:39 PM, David McNelis wrote:

                        Is it an option to not convert the data to
                binary prior to
                inserting
                        into Cassandra?  Also, how large are the strings
                you're sorting?
                          If its
                        viable to not convert to binary before writing
                to Cassandra, and
                        you use
                        one of the string based column ordering
                techniques (utf8, ascii,
                for
                        example), then the data would be sorted without
                you  needing to
                        specifically worry about that.  Of course, if
                the strings are
                        lengthy
                        you could run into  additional issues.

                        On Wed, Oct 12, 2011 at 11:34 AM, Matthias
                Pfaup...@l3s.de mailto:p...@l3s.de
                mailto:p...@l3s.de 

Re: Cassandra as session store under heavy load

2011-10-13 Thread Jonathan Ellis
Or upgrade to 1.0 and use leveled compaction
(http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra)

On Thu, Oct 13, 2011 at 4:28 PM, aaron morton aa...@thelastpickle.com wrote:
 They only have a minimum time, gc_grace_seconds for deletes.

 If you really want to watch disk space, reduce the compaction thresholds on
 the CF.

 Or run a major compaction as part of maintenance.

 cheers

 -
 Aaron Morton
 Freelance Cassandra Developer
 @aaronmorton
 http://www.thelastpickle.com

 On 13/10/2011, at 10:50 PM, Maciej Miklas wrote:

 durable_writes sounds great - thank you! I really do not need the commit log
 here.

 Another question: is it possible to configure the lifetime of tombstones?


 Regards,
 Maciej





-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
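
If the tombstone question above comes down to gc_grace_seconds, one way to
lower it per column family from client code is Hector's schema API. The sketch
below is only an assumption about that API (describeKeyspace, getCfDefs,
setGcGraceSeconds and updateColumnFamily as commonly described, with
illustrative keyspace and column family names), so verify the method names
against your Hector version before relying on it:

    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.ddl.ColumnFamilyDefinition;
    import me.prettyprint.hector.api.ddl.KeyspaceDefinition;
    import me.prettyprint.hector.api.factory.HFactory;

    // Sketch: fetch the existing definition, lower gc_grace_seconds, push it
    // back. A shorter gc_grace lets tombstones be purged sooner, at the usual
    // risk of resurrected deletes if a node stays down longer than the window.
    public class TombstoneTuning {
        public static void main(String[] args) {
            Cluster cluster = HFactory.getOrCreateCluster("Test Cluster", "localhost:9160");

            KeyspaceDefinition ksDef = cluster.describeKeyspace("SessionStore"); // illustrative
            for (ColumnFamilyDefinition cfDef : ksDef.getCfDefs()) {
                if ("Sessions".equals(cfDef.getName())) {                        // illustrative
                    cfDef.setGcGraceSeconds(3600);  // one hour instead of the 10-day default
                    cluster.updateColumnFamily(cfDef);
                }
            }
        }
    }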


Restore snapshots suggestion

2011-10-13 Thread Daning
If I need to restore snapshots from all nodes, but I can only shut down
one node at a time since it is production, is there a way I can stop data
syncing between nodes temporarily? I don't want the existing data to
overwrite the snapshot. I found this undocumented parameter
DoConsistencyChecksBoolean (http://www.datastax.com/dev/blog/whats-new-cassandra-066)
to disable read repair; what is the proper way to do it?



I am on 0.8.6.

Thank you in advance,

Daning


Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Thanks Brandon!  Out of curiosity, would making schema changes through a
thrift interface (via hector) be any different?  In other words, would using
hector instead of the cli make schema changes possible without upgrading?

On Thu, Oct 13, 2011 at 8:22 AM, Brandon Williams dri...@gmail.com wrote:

 You're running into https://issues.apache.org/jira/browse/CASSANDRA-3259

 Try upgrading and doing a rolling restart.

 -Brandon

 On Thu, Oct 13, 2011 at 9:11 AM, Eric Czech e...@nextbigsound.com wrote:
  Nope, there was definitely no intersection of the seed nodes between the
 two
  clusters so I'm fairly certain that the second cluster found out about
 the
  first through what was in the LocationInfo* system tables.  Also, I don't
  think that procedure will really help because I don't actually want the
  schema on cass-analysis-1 to be consistent with the schema in the
 original
  cluster -- I just want to totally remove it.
 
  On Thu, Oct 13, 2011 at 8:01 AM, Mohit Anchlia mohitanch...@gmail.com
  wrote:
 
  Do you have same seed node specified in cass-analysis-1 as cass-1,2,3?
  I am thinking that changing the seed node in cass-analysis-2 and
  following the directions in
  http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve
  the problem. Somone please correct me.
 
  On Thu, Oct 13, 2011 at 12:05 AM, Eric Czech e...@nextbigsound.com
  wrote:
   I don't think that's what I'm after here since the unwanted nodes were
   originally assimilated into the cluster with the same initial_token
   values
   as other nodes that were already in the cluster (that have, and still
 do
   have, useful data).  I know this is an awkward situation so I'll try
 to
   depict it in a simpler way:
   Let's say I have a simplified version of our production cluster that
   looks
   like this -
   cass-1   token = A
   cass-2   token = B
   cass-3   token = C
   Then I tried to create a second cluster that looks like this -
   cass-analysis-1   token = A  (and contains same data as cass-1)
   cass-analysis-2   token = B  (and contains same data as cass-2)
   cass-analysis-3   token = C  (and contains same data as cass-3)
   But after starting the second cluster, things got crossed up between
 the
   clusters and here's what the original cluster now looks like -
   cass-1   token = A   (has data and schema)
   cass-2   token = B   (has data and schema)
   cass-3   token = C   (had data and schema)
   cass-analysis-1   token = A  (has *no* data and is not part of the
 ring,
   but
   is trying to be included in cluster schema)
   A simplified version of describe cluster  for the original cluster
 now
   shows:
   Cluster Information:
  Schema versions:
   SCHEMA-UUID-1: [cass-1, cass-2, cass-3]
   SCHEMA-UUID-2: [cass-analysis-1]
   But the simplified ring looks like this (has only 3 nodes instead of
 4):
   Host   Owns Token
   cass-1 33%   A
   cass-2 33%   B
   cass-3 33%   C
   The original cluster is still working correctly but all live schema
   updates
   are failing because of the inconsistent schema versions introduced by
   the
   unwanted node.
   From my perspective, a simple fix seems to be for cassandra to exclude
   nodes
   that aren't part of the ring from the schema consistency requirements.
Any
   reason that wouldn't work?
   And aside from a possible code patch, any recommendations as to how I
   can
   best fix this given the current 8.4 release?
  
   On Thu, Oct 13, 2011 at 12:14 AM, Jonathan Ellis jbel...@gmail.com
   wrote:
  
   Does nodetool removetoken not work?
  
   On Thu, Oct 13, 2011 at 12:59 AM, Eric Czech e...@nextbigsound.com
   wrote:
Not sure if anyone has seen this before but it's really killing me
right
now.  Perhaps that was too long of a description of the issue so
here's
a
more succinct question -- How do I remove nodes associated with a
cluster
that contain no data and have no reason to be associated with the
cluster
whatsoever?
My last resort here is to stop cassandra (after recording all
 tokens
for
each node), set the initial token for each node in the cluster in
cassandra.yaml, manually delete the LocationInfo* sstables in the
system
keyspace, and then restart.  I'm hoping there's a simpler, less
seemingly
risky way to do this so please, please let me know if that's true!
Thanks again.
- Eric
On Tue, Oct 11, 2011 at 11:55 AM, Eric Czech 
 e...@nextbigsound.com
wrote:
   
Hi, I'm having what I think is a fairly uncommon schema issue --
My situation is that I had a cluster with 10 nodes and a
 consistent
schema.  Then, in an experiment to setup a second cluster with the
same
information (by copying the raw sstables), I left the
 LocationInfo*
sstables
in the system keyspace in the new cluster and after starting the
second
cluster, I realized that the two clusters were discovering each
other
when