Re: Thanks for all the fish.

2016-08-19 Thread Brian O'Neill
+1, props to the giant on whose shoulders we stand.
-- 
Brian O'Neill
Principal Architect @ Monetate
m: 215.588.6024
bone...@monetate.com <mailto:bone...@monetate.com>
> On Aug 19, 2016, at 4:29 PM, Brandon Williams <dri...@gmail.com> wrote:
> 
> If there is one thing I am damn sure of, it's that I wouldn't be here
> without Jonathan's leadership and friendship.  Thank you for all you've
> done, old buddy.
> 
> Kind Regards,
> Brandon
> 
> On Fri, Aug 19, 2016 at 2:20 PM, Michael Kjellman <
> mkjell...@internalcircle.com> wrote:
> 
>> Just wanted to say thank you publicly to Jonathan Ellis for his tireless
>> work making this community and software what it is. He's always been level
>> headed and I certainly wouldn't be where I am without his leadership.
>> 
>> So, Jonathan, thanks for all the fish.
>> 
>> best,
>> kjellman
>> 
>> Sent from my iPhone
>> 



Re: Wrap around CQL queries for token ranges?

2015-05-11 Thread Brian O'Neill
Looks like the java-driver supplies the hack I need.  (TokenRange.unwrap)

I'll leave it to you guys to decide if it is more elegant to support
wrapping natively in CQL.

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42


This information transmitted in this email message is for the intended
recipient only and may contain confidential and/or privileged material. If
you received this email in error and are not the intended recipient, or the
person responsible to deliver it to the intended recipient, please contact
the sender at the email above and delete this email and any attachments and
destroy any copies thereof. Any review, retransmission, dissemination,
copying or other use of, or taking any action in reliance upon, this
information by persons or entities other than the intended recipient is
strictly prohibited.
 


From:  Brian O'Neill b...@alumni.brown.edu
Date:  Monday, May 11, 2015 at 12:32 PM
To:  dev@cassandra.apache.org dev@cassandra.apache.org
Subject:  Wrap around CQL queries for token ranges?


I was doing some testing around data locality today (and adding it to our
distributed processing layer).
I retrieved all of the TokenRanges back using:
tokenRanges = metadata.getTokenRanges(keyspace, localhost)


And when I spun through the token ranges returned, I ended up missing
records.  
The root cause was the "edge case" where the ring wraps around.

It generated the following CQL query: (using the last token range)

cqlsh> SELECT token(id),id,name FROM test_keyspace.test_table WHERE
token(id) > 8743874685407455894 AND token(id) <= -8851282698028303387;

(0 rows)

cqlsh> SELECT token(id),id,name FROM test_keyspace.test_table WHERE
token(id) <= -8851282698028303387 AND token(id) > -9223372036854775808;

 token(id)            | id | name
----------------------+----+--------
 -9157060164899361011 | 23 | name23
 -9108684050423740263 | 53 | name53
 -9084883821289052775 | 91 | name91
(3 rows)

NOTE: If I use Long.MAX_VALUE instead, I get the records.

I can hack this at the app layer, to issue separate queries for the wrap
around case, but I wonder if CQL should support wrap around queries???

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42


 




Wrap around CQL queries for token ranges?

2015-05-11 Thread Brian O'Neill

I was doing some testing around data locality today (and adding it to our
distributed processing layer).
I retrieved all of the TokenRanges back using:
tokenRanges = metadata.getTokenRanges(keyspace, localhost)


And when I spun through the token ranges returned, I ended up missing
records.  
The root cause was the "edge case" where the ring wraps around.

It generated the following CQL query: (using the last token range)

cqlsh> SELECT token(id),id,name FROM test_keyspace.test_table WHERE
token(id) > 8743874685407455894 AND token(id) <= -8851282698028303387;

(0 rows)

cqlsh> SELECT token(id),id,name FROM test_keyspace.test_table WHERE
token(id) <= -8851282698028303387 AND token(id) > -9223372036854775808;

 token(id)            | id | name
----------------------+----+--------
 -9157060164899361011 | 23 | name23
 -9108684050423740263 | 53 | name53
 -9084883821289052775 | 91 | name91
(3 rows)

NOTE: If I use Long.MAX_VALUE instead, I get the records.

I can hack this at the app layer, to issue separate queries for the wrap
around case, but I wonder if CQL should support wrap around queries???
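
For anyone hitting the same edge case, here is a minimal sketch of the app-layer workaround. It assumes signed 64-bit tokens, as in the output above, and treats a range (start, end] whose start is not below its end as wrapping past the top of the ring, so it is queried as two clauses. `TokenRangeQueries` and `whereClauses` are hypothetical names for illustration, not part of the java-driver.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the java-driver. */
final class TokenRangeQueries {

    /**
     * Builds the WHERE clause(s) covering the token range (start, end].
     * A range whose start is not below its end wraps around the ring,
     * so it is covered by two clauses instead of one.
     */
    static List<String> whereClauses(String partitionKey, long start, long end) {
        String token = "token(" + partitionKey + ")";
        List<String> clauses = new ArrayList<>();
        if (start < end) {
            clauses.add(token + " > " + start + " AND " + token + " <= " + end);
        } else {
            // Wrap-around: everything above start, plus everything up to end.
            clauses.add(token + " > " + start);
            clauses.add(token + " <= " + end);
        }
        return clauses;
    }

    public static void main(String[] args) {
        // The last token range from the example above.
        for (String clause : whereClauses("id", 8743874685407455894L, -8851282698028303387L)) {
            System.out.println(
                "SELECT token(id), id, name FROM test_keyspace.test_table WHERE " + clause + ";");
        }
    }
}
```

Run against the last token range, this prints two SELECT statements, one for each side of the wrap.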

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42


 




Re: Conditional Update Code?

2015-03-04 Thread Brian O'Neill

Finally getting to this...

For the UDF, javascript?

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42

 




On 2/6/15, 9:50 AM, Benedict Elliott Smith belliottsm...@datastax.com
wrote:

It's quite possible support could be added to evaluate a UDF as part of the
condition check. The code you're looking for is in the implementors of
CASRequest.appliesTo(), in CQL3CasRequest and
CassandraServer.ThriftCASRequest.

It seems like https://issues.apache.org/jira/browse/CASSANDRA-8488 would
offer the basic functionality, except that it is expected to require ALLOW
FILTERING, which is unlikely to be permitted for a CAS operation, since the
implication is that the work is too expensive for normal use. Such a
constraint is probably not necessary if a clustering prefix is provided,
though (i.e. a full CQL row key).

On Fri, Feb 6, 2015 at 2:38 PM, Brian O'Neill b...@alumni.brown.edu
wrote:


 All,

 I'm just looking for a little direction...

 Anyone know where I can find the code that checks the condition in a
 conditional update?
 We'd love to have more expressive conditions, beyond just equality. (e.g.
 column contains? value)

 I wanted to see how hard this would be to contribute.
 Is such a JIRA already open?

 -brian

 ---
 Brian O'Neill
 Chief Technology Officer
 Health Market Science, a LexisNexis Company
 215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42










Re: Conditional Update Code?

2015-03-04 Thread Brian O'Neill

Interesting, I just saw the function definition stuff in AggregationTest.

I’ll dig in there.  It seems like we could re-use those functions for
conditional updates?

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42

 




On 3/4/15, 12:50 PM, Brian O'Neill b...@alumni.brown.edu wrote:


Finally getting to this...

For the UDF, javascript?

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42

 




On 2/6/15, 9:50 AM, Benedict Elliott Smith belliottsm...@datastax.com
wrote:

It's quite possible support could be added to evaluate a UDF as part of the
condition check. The code you're looking for is in the implementors of
CASRequest.appliesTo(), in CQL3CasRequest and
CassandraServer.ThriftCASRequest.

It seems like https://issues.apache.org/jira/browse/CASSANDRA-8488 would
offer the basic functionality, except that it is expected to require ALLOW
FILTERING, which is unlikely to be permitted for a CAS operation, since the
implication is that the work is too expensive for normal use. Such a
constraint is probably not necessary if a clustering prefix is provided,
though (i.e. a full CQL row key).

On Fri, Feb 6, 2015 at 2:38 PM, Brian O'Neill b...@alumni.brown.edu
wrote:


 All,

 I'm just looking for a little direction...

 Anyone know where I can find the code that checks the condition in a
 conditional update?
 We'd love to have more expressive conditions, beyond just equality. (e.g.
 column contains? value)

 I wanted to see how hard this would be to contribute.
 Is such a JIRA already open?

 -brian

 ---
 Brian O'Neill
 Chief Technology Officer
 Health Market Science, a LexisNexis Company
 215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42









Conditional Update Code?

2015-02-06 Thread Brian O'Neill

All,

I'm just looking for a little direction...

Anyone know where I can find the code that checks the condition in a
conditional update?
We'd love to have more expressive conditions, beyond just equality.  (e.g.
column contains? value)

I wanted to see how hard this would be to contribute.
Is such a JIRA already open?

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42


 




Re: Conditional Update Code?

2015-02-06 Thread Brian O'Neill

Perfect. Thanks.

Let me see what I can cook up as a PoC.

The specific use case we are looking to address is for real-time
aggregations, done in memory, then periodically flushed to C*.  (e.g.
every 30 seconds)
(similar to what Druid does, but native on top of C*)

In this scenario, we aggregate app-side for a specific time
slice/partition of data.  We want to update the aggregate value only if
that time slice/partition has not already been incorporated into the
value.  If we have a native way to check to see if the partition was
already incorporated as part of the conditional update, it will simplify
the app layer.
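
To make the intended semantics concrete, here is a minimal in-memory sketch of "update the aggregate only if this time slice has not been incorporated yet". It is a stand-in for the conditional update, not Cassandra code: `AggregateRow`, `addIfSliceAbsent`, and the slice ids are hypothetical names, and the map of already-seen slices is only an assumption about how the row might be modelled.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for one aggregate row; all names are hypothetical. */
final class AggregateRow {
    private long count;
    // Time slices already folded into count, keyed by slice id.
    private final Map<String, Long> partitions = new HashMap<>();

    /**
     * Mimics a conditional update: add the slice's delta only if the slice
     * has not been recorded yet. Returns whether the update was applied.
     */
    synchronized boolean addIfSliceAbsent(String sliceId, long delta, long flushedAtMillis) {
        if (partitions.containsKey(sliceId)) {
            return false; // slice already incorporated, leave the value alone
        }
        count += delta;
        partitions.put(sliceId, flushedAtMillis);
        return true;
    }

    synchronized long count() {
        return count;
    }

    public static void main(String[] args) {
        AggregateRow row = new AggregateRow();
        // A second flush of the same slice is rejected, so the count is not doubled.
        System.out.println(row.addIfSliceAbsent("slice-0950", 30, 1L));
        System.out.println(row.addIfSliceAbsent("slice-0950", 30, 2L));
        System.out.println(row.count());
    }
}
```

With a native condition, the check and the write would happen in one server-side step, and the app layer would not need to track the seen slices itself.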

-brian

---
Brian O'Neill 
Chief Technology Officer
Health Market Science, a LexisNexis Company
215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42

 




On 2/6/15, 9:50 AM, Benedict Elliott Smith belliottsm...@datastax.com
wrote:

It's quite possible support could be added to evaluate a UDF as part of the
condition check. The code you're looking for is in the implementors of
CASRequest.appliesTo(), in CQL3CasRequest and
CassandraServer.ThriftCASRequest.

It seems like https://issues.apache.org/jira/browse/CASSANDRA-8488 would
offer the basic functionality, except that it is expected to require ALLOW
FILTERING, which is unlikely to be permitted for a CAS operation, since the
implication is that the work is too expensive for normal use. Such a
constraint is probably not necessary if a clustering prefix is provided,
though (i.e. a full CQL row key).

On Fri, Feb 6, 2015 at 2:38 PM, Brian O'Neill b...@alumni.brown.edu
wrote:


 All,

 I'm just looking for a little direction...

 Anyone know where I can find the code that checks the condition in a
 conditional update?
 We'd love to have more expressive conditions, beyond just equality. (e.g.
 column contains? value)

 I wanted to see how hard this would be to contribute.
 Is such a JIRA already open?

 -brian

 ---
 Brian O'Neill
 Chief Technology Officer
 Health Market Science, a LexisNexis Company
 215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42










Re: Refactoring cassandra service package

2014-06-03 Thread Brian O'Neill

Interesting proposition.  We've embedded Cassandra a few times, so I'd be
interested in an approach that makes that easier.

Is there a way to do it incrementally?  Introduce the injection framework,
and convert a few classes (those required for startup), then slowly
convert the remainder?

peanut gallery,
-brian

---
Brian O'Neill
Chief Technology Officer

Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42  •
healthmarketscience.com

 






On 6/3/14, 1:59 PM, Gary Dusbabek gdusba...@gmail.com wrote:

On Tue, Jun 3, 2014 at 3:52 AM, Simon Chemouil schemo...@gmail.com
wrote:

 Hi,

 I'm new to Cassandra and felt like exploring and hacking on the code. I
 was surprised to see the usage of so many mutable/global state statics
 all over the service package (basically global variables/singletons).

 While I understand it can be practical to work with singletons, and that
 in any case I'm not sure multi-tenant Cassandra (as in two different
 Cassandra instances within the same process) would make sense at all (or
 even work considering there is some native access going on with JNA), I
 find static state can easily lead to tangled 'spaghetti' code (accessing
 the singletons from anywhere, even where one shouldn't), and in general
 it ties the code to the VM instance, rather than to the class.

 I tried to find if it was an actual design choice, but from my
 understanding this is more something inherited from the early Cassandra
 times at Facebook. I just found this thread[1] pointing to issue
 CASSANDRA-741 (slightly more limited scope) that was marked as WONTFIX
 because no one took it (but still marked as open for contribution). The
 current code conventions also don't mention the usage of singletons
 except by stating: "Do not extract interfaces (or abstract classes)
 unless you actually need multiple implementations of it" (switching to a
 service-style design doesn't require passing interfaces but it's
 highly encouraged to help testability).

 So, I'd like to try to make this refactoring happen and remove all (or
 most) mutable static state. It would be an easy way in for me in
 Cassandra's internals (maybe to contribute further). I think it would
 help testing (ability to unit test components without going to the
 storage for instance) and in general modernize the code. It would also
 make hacking on Cassandra easier because people could pick different
 pieces without pulling the whole thing.

 It would definitely break backwards compatibility with current Java code
 that directly embeds Cassandra / uses it as a library, but I would keep
 the same abstraction so the refactoring would be easy. In any case,
 backwards compatibility can be broken by many more changes than just
 refactoring, and once this is done it will be easier to deal with
 backwards compatibility.

 Obviously all .instance fields would be gone, and I'd try to fix
 potential cyclic class dependencies and generally make sure class
 dependencies form a directed acyclic graph with CassandraDaemon as its
 root. The basic idea is to have each 'service' component require all its
 service dependencies in their constructor (and keeping them as a final
 field), rather than getting them via the global namespace (singleton
 instances).
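
 As a rough illustration of the constructor-injection idea described above, here is a minimal sketch. The names (`InjectionSketch`, `Storage`, `InMemoryStorage`, `QueryService`) are hypothetical and do not correspond to actual Cassandra classes; the point is only that each component receives its dependencies through its constructor, and the root wires the graph once at startup.

```java
/** Illustration only; none of these types exist in Cassandra. */
final class InjectionSketch {

    /** Hypothetical dependency, expressed as an interface so tests can stub it. */
    interface Storage {
        String read(String key);
    }

    static final class InMemoryStorage implements Storage {
        @Override
        public String read(String key) {
            return "value-for-" + key;
        }
    }

    static final class QueryService {
        // The dependency is passed in and kept as a final field,
        // rather than fetched from a global .instance singleton.
        private final Storage storage;

        QueryService(Storage storage) {
            this.storage = storage;
        }

        String lookup(String key) {
            return storage.read(key);
        }
    }

    public static void main(String[] args) {
        // The root of the dependency graph builds everything once.
        QueryService service = new QueryService(new InMemoryStorage());
        System.out.println(service.lookup("k1"));
    }
}
```

 A unit test can then construct `QueryService` with a stub `Storage` and never touch real storage, which is the testability benefit mentioned above.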

 If I had it my way, I'd probably use a dependency injection framework,
 namely Dagger which is as far as I know the lightest Java DI framework
 actively developed (jointly developed by Square and Google's Java team
 responsible for Guice & Guava), which has a neat compile-time annotation
 processor that detects missing dependencies early on. It works with both
 Android and J2SE and is very fast, simple and light (65kB vs 710kB for
 Guice).

 So, the question is: would you guys accept such a patch? I'd rather not
 do the work if it has no chance of being merged upstream :).


This has come up before. Let's face it, removing the singletons is a
tempting proposition.

Several of us have been down the path of trying to do it.

At the end of the day, here's what you'd end up with (absolutely best
case):

1. Modifying just about every class, sometimes substantially.
2. A huge patch for someone else to review.
3. No performance gains, no bug fixes.  In fact, since so many classes have
to be changed, I'd say

Re: NPE in conditional updates w/ collections in 2.0.7

2014-05-16 Thread Brian O'Neill
Perfect.  Thanks Tyler.

Great to hear you guys are already on top of it.  I’ll watch for the
resolution.

-brian

---
Brian O'Neill
Chief Technology Officer

Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42  •
healthmarketscience.com

 






On 5/16/14, 12:25 PM, Tyler Hobbs ty...@datastax.com wrote:

Hi Brian,

Thanks for the report.  This looks like
https://issues.apache.org/jira/browse/CASSANDRA-7155, which should be fixed
shortly.


On Thu, May 15, 2014 at 3:23 PM, Brian O'Neill
b...@alumni.brown.edu wrote:


 OK - we've got some hyper data modeling going on, taking advantage of all
 the latest toys in CQL 2.  And we ran into some trouble using maps within
 conditional updates.  Specifically, when testing to see if a key exists in a
 map (with =null?), we encounter an NPE server-side.  We believe this worked
 in 2.0.4.

 With this schema:
 CREATE TABLE progress (
 key text,
 count int,
 partitions map<text, timestamp>,
 primary key (key)
 );

 When executing the following:
 cqlsh:hms> UPDATE foo SET count=4962 WHERE key='PA' IF
 partitions['a']=null;

  [applied]
 -----------
      False

 cqlsh:hms> UPDATE foo SET count=4962 WHERE key='PA';
 cqlsh:hms> UPDATE foo SET count=4962 WHERE key='PA' IF
 partitions['a']=null;
 TSocket read 0 bytes

 We see the following NPE server-side:
 ERROR [Native-Transport-Requests:13353] 2014-05-15 15:10:00,154
 QueryMessage.java (line 131) Unexpected error during query
 java.lang.NullPointerException
 at org.apache.cassandra.cql3.ColumnCondition$WithVariables.collectionAppliesTo(ColumnCondition.java:168)
 at org.apache.cassandra.cql3.ColumnCondition$WithVariables.appliesTo(ColumnCondition.java:142)
 at org.apache.cassandra.cql3.statements.CQL3CasConditions$ColumnsConditions.appliesTo(CQL3CasConditions.java:197)
 at org.apache.cassandra.cql3.statements.CQL3CasConditions.appliesTo(CQL3CasConditions.java:108)

 Is there a better way to test for existence of a key?
 Or is this a bug?  (Regardless, we may want to protect against the NPE)
 Or am I missing something entirely?

 -brian

 ---
 Brian O'Neill
 Chief Technology Officer


 Health Market Science
 The Science of Better Results
 2700 Horizon Drive • King of Prussia, PA • 19406
 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42   •
 healthmarketscience.com








-- 
Tyler Hobbs
DataStax http://datastax.com/




NPE in conditional updates w/ collections in 2.0.7

2014-05-16 Thread Brian O'Neill

OK - we've got some hyper data modeling going on, taking advantage of all
the latest toys in CQL 2.  And we ran into some trouble using maps within
conditional updates.  Specifically, when testing to see if a key exists in a
map (with =null?), we encounter an NPE server-side.  We believe this worked
in 2.0.4.

With this schema:
CREATE TABLE progress (
key text,
count int,
partitions map<text, timestamp>,
primary key (key)
);

When executing the following:
cqlsh:hms> UPDATE foo SET count=4962 WHERE key='PA' IF partitions['a']=null;

 [applied]
-----------
     False

cqlsh:hms> UPDATE foo SET count=4962 WHERE key='PA';
cqlsh:hms> UPDATE foo SET count=4962 WHERE key='PA' IF partitions['a']=null;
TSocket read 0 bytes

We see the following NPE server-side:
ERROR [Native-Transport-Requests:13353] 2014-05-15 15:10:00,154
QueryMessage.java (line 131) Unexpected error during query
java.lang.NullPointerException
at org.apache.cassandra.cql3.ColumnCondition$WithVariables.collectionAppliesTo(ColumnCondition.java:168)
at org.apache.cassandra.cql3.ColumnCondition$WithVariables.appliesTo(ColumnCondition.java:142)
at org.apache.cassandra.cql3.statements.CQL3CasConditions$ColumnsConditions.appliesTo(CQL3CasConditions.java:197)
at org.apache.cassandra.cql3.statements.CQL3CasConditions.appliesTo(CQL3CasConditions.java:108)

Is there a better way to test for existence of a key?
Or is this a bug?  (Regardless, we may want to protect against the NPE)
Or am I missing something entirely?

-brian

---
Brian O'Neill
Chief Technology Officer


Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42   •
healthmarketscience.com


 




[applied] column in ModificationStatement?

2014-02-06 Thread Brian O'Neill
Silly question...

Using the CQL driver for conditional updates, I'm looking into the ResultSet
that comes back:
for (ColumnDefinitions.Definition definition :
        results.getColumnDefinitions().asList()) {
    for (Row row : results.all()) {
        LOG.debug("UPDATE APPLIED = [{}]=[{}]",
                definition.getName(), row.getBool(definition.getName()));
    }
}

I noticed that the ResultSet of a conditional update contains a column
"[applied]", with a boolean indicating whether or not the update was
applied.

I assume this column name comes from:
src/java/org/apache/cassandra/cql3/statements/ModificationStatement.java:50
   private static final ColumnIdentifier CAS_RESULT_COLUMN = new
ColumnIdentifier("[applied]", false);

Does it make sense to expose this column name as a String constant
somewhere? 
Either in the CQL java-driver, or Cassandra itself?

-brian

---
Brian O'Neill
Chief Technology Officer


Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42   •
healthmarketscience.com


 




Re: [applied] column in ModificationStatement?

2014-02-06 Thread Brian O'Neill
Thanks Jonathan.  
It feels a little weird, but that will work.

Not a big deal, but maybe we could include a wasApplied() method on the
ResultSet in the future that would insulate clients from the ResultSet
schema/column name.
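
Roughly what I have in mind, as a sketch only. The row is modelled as a plain map so the example stands on its own; a real driver Row would differ. `CasResults`, `APPLIED_COLUMN`, and this `wasApplied` helper are hypothetical names, and treating a result without the column as "applied" is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical client-side helper; not an existing driver class. */
final class CasResults {
    /** Column name that carries the CAS outcome, as discussed in this thread. */
    static final String APPLIED_COLUMN = "[applied]";

    /**
     * Returns whether the update was applied, hiding the column name
     * from callers. The row is a plain map here for illustration.
     */
    static boolean wasApplied(Map<String, Object> row) {
        Object applied = row.get(APPLIED_COLUMN);
        // Assumption: a result without the column came from an
        // unconditional statement, which is treated as applied.
        return applied == null || Boolean.TRUE.equals(applied);
    }

    public static void main(String[] args) {
        Map<String, Object> row = new HashMap<>();
        row.put(APPLIED_COLUMN, Boolean.FALSE);
        System.out.println(wasApplied(row));
    }
}
```

Callers would then depend on one method rather than on the literal column name.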

-brian


---
Brian O'Neill
Chief Technology Officer

Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42  •
healthmarketscience.com

 






On 2/6/14, 2:58 PM, Jonathan Ellis jbel...@gmail.com wrote:

In Cassandra, it's ModificationStatement.CAS_RESULT_COLUMN.text

On Thu, Feb 6, 2014 at 10:22 AM, Brian O'Neill b...@alumni.brown.edu
wrote:
 Silly question...

 Using the CQL driver for conditional updates, I'm looking into the ResultSet
 that comes back:
 for (ColumnDefinitions.Definition definition :
         results.getColumnDefinitions().asList()) {
     for (Row row : results.all()) {
         LOG.debug("UPDATE APPLIED = [{}]=[{}]",
                 definition.getName(), row.getBool(definition.getName()));
     }
 }

 I noticed that the ResultSet of a conditional update contains a column
 "[applied]", with a boolean indicating whether or not the update was
 applied.

 I assume this column name comes from:
 
src/java/org/apache/cassandra/cql3/statements/ModificationStatement.java:
50
private static final ColumnIdentifier CAS_RESULT_COLUMN = new
 ColumnIdentifier([applied], false);

 Does it make sense to expose this column name as a String constant
 somewhere?
 Either in the CQL java-driver, or Cassandra itself?

 -brian

 ---
 Brian O'Neill
 Chief Technology Officer


 Health Market Science
 The Science of Better Results
 2700 Horizon Drive • King of Prussia, PA • 19406
 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
 healthmarketscience.com








-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced




Re: Dimensional SUM, COUNT, DISTINCT in C* (replacing Acunu)

2013-12-18 Thread Brian O'Neill

Thanks for the pointer Alain.

At a quick glance, it looks like people are looking for query time
filtering/aggregation, which will suffice for small data sets.  Hopefully we
might be able to extend that to perform pre-computations as well. (which
would support much larger data sets / volumes)

I'll continue the discussion on the issue.

thanks again,
brian


---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com


 


From:  Alain RODRIGUEZ arodr...@gmail.com
Reply-To:  u...@cassandra.apache.org
Date:  Wednesday, December 18, 2013 at 5:13 AM
To:  u...@cassandra.apache.org
Cc:  dev@cassandra.apache.org dev@cassandra.apache.org
Subject:  Re: Dimensional SUM, COUNT,  DISTINCT in C* (replacing Acunu)

Hi, this would indeed be much appreciated by a lot of people.

There is an existing issue on this subject:

 https://issues.apache.org/jira/browse/CASSANDRA-4914

Maybe you could help the committers there.

Hope this will be useful to you.

Please let us know when you find a way to do these operations.

Cheers.


2013/12/18 Brian O'Neill b...@alumni.brown.edu
 We are seeking to replace Acunu in our technology stack / platform.  It is the
 only component in our stack that is not open source.
 
 In preparation, over the last few weeks I've migrated Virgil to CQL.   The
 vision is that Virgil could receive a REST request to upsert/delete data
 (hierarchical JSON to support collections).  Virgil would lookup the
 dimensions/aggregations for that table, add the key to the pertinent
 dimensional tables (e.g. DISTINCT), incorporate values into aggregations (e.g.
 SUMs) and increment/decrement relevant counters (COUNT).  (using additional
 CF's)
 
 This seems straight-forward, but appears to require a read-before-write.
 (e.g. read the current value of a SUM, incorporate the new value, then use the
 lightweight transactions of C* 2.0 to conditionally update the value.)
 
 Before I go down this path, I was wondering if anyone is designing/working on
 the same, perhaps at a lower level?  (CQL?)
 
 Is there any intent to support aggregations/filters (COUNT, SUM, DISTINCT,
 etc) at the CQL level?  If so, is there a preliminary design?
 
 I can see a lower-level approach, which would leverage the commit logs (and
 mem/sstables) and perform the aggregation during read-operations (and
 flush/compaction).
 
 thoughts?  i'm open to all ideas.
 
 -brian
 -- 
 Brian ONeill
 Chief Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024 tel:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42





Dimensional SUM, COUNT, DISTINCT in C* (replacing Acunu)

2013-12-17 Thread Brian O'Neill
We are seeking to replace Acunu in our technology stack / platform.  It is
the only component in our stack that is not open source.

In preparation, over the last few weeks I've migrated Virgil to CQL.   The
vision is that Virgil could receive a REST request to upsert/delete data
(hierarchical JSON to support collections).  Virgil would look up the
dimensions/aggregations for that table, add the key to the pertinent
dimensional tables (e.g. DISTINCT), incorporate values into aggregations
(e.g. SUMs) and increment/decrement relevant counters (COUNT).  (using
additional CFs)

This seems straightforward, but appears to require a read-before-write.
 (e.g. read the current value of a SUM, incorporate the new value, then use
the lightweight transactions of C* 2.0 to conditionally update the value.)
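
The read-before-write pattern described above can be sketched as a compare-and-set retry loop. This is only an illustration under stated assumptions: the class `ConditionalSum` is hypothetical, and an in-memory `ConcurrentMap` stands in for the aggregate table, with `putIfAbsent` and `replace` playing the role of `INSERT ... IF NOT EXISTS` and `UPDATE ... IF sum = <current>`. It is not a Cassandra client.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Sketch: SUM maintenance via read, compute, conditional write, retry. */
final class ConditionalSum {
    // In-memory stand-in for an aggregate column family (assumption).
    private final ConcurrentMap<String, Long> sums = new ConcurrentHashMap<>();

    /** Adds delta to the SUM stored under key and returns the new value. */
    long add(String key, long delta) {
        while (true) {
            Long current = sums.get(key);                       // read-before-write
            long updated = (current == null ? 0L : current) + delta;
            boolean applied = (current == null)
                    ? sums.putIfAbsent(key, updated) == null    // like INSERT ... IF NOT EXISTS
                    : sums.replace(key, current, updated);      // like UPDATE ... IF sum = current
            if (applied) {
                return updated;
            }
            // Another writer changed the value first; re-read and retry.
        }
    }

    /** Returns the current SUM, or 0 if the key has never been written. */
    long get(String key) {
        return sums.getOrDefault(key, 0L);
    }

    public static void main(String[] args) {
        ConditionalSum sum = new ConditionalSum();
        sum.add("region:us", 5);
        sum.add("region:us", 7);
        System.out.println(sum.get("region:us"));
    }
}
```

The cost the post worries about is visible in the loop: every update is at least one read plus one conditional write, and more under contention.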

Before I go down this path, I was wondering if anyone is designing/working
on the same, perhaps at a lower level?  (CQL?)

Is there any intent to support aggregations/filters (COUNT, SUM, DISTINCT,
etc) at the CQL level?  If so, is there a preliminary design?

I can see a lower-level approach, which would leverage the commit logs (and
mem/sstables) and perform the aggregation during read-operations (and
flush/compaction).

thoughts?  i'm open to all ideas.

-brian
-- 
Brian ONeill
Chief Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Submit enhancements via pull requests?

2013-12-05 Thread Brian O'Neill

Sorry guys, it's been a while since I submitted a patch.

I see there are a number of outstanding pull requests:
https://github.com/apache/cassandra/pulls

Are we able to submit enhancements via pull requests on github now?
Or are we still using JIRA + patches?

(I have a very minor change to an error message that I'd like to get in
there)

thanks,
brian

---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com


 




Re: Submit enhancements via pull requests?

2013-12-05 Thread Brian O'Neill
Thanks Jeremiah.  Done.
https://issues.apache.org/jira/browse/CASSANDRA-6453


-brian

---
Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42  •
healthmarketscience.com

 






On 12/5/13, 10:47 AM, Jeremiah D Jordan jerem...@datastax.com wrote:

JIRA + patch or link to git branch

-Jeremiah

On Dec 5, 2013, at 9:44 AM, Brian O'Neill b...@alumni.brown.edu wrote:

 
 Sorry guys, it's been a while since I submitted a patch.
 
 I see there are a number of outstanding pull requests:
 https://github.com/apache/cassandra/pulls
 
 Are we able to submit enhancements via pull requests on github now?
 Or are we still using JIRA + patches?
 
 (I have a very minor change to an error message that I'd like to get in
 there)
 
 thanks,
 brian
 
 ---
 Brian O'Neill
 Chief Architect
 Health Market Science
 The Science of Better Results
 2700 Horizon Drive • King of Prussia, PA • 19406
 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
 healthmarketscience.com
 
 
 
 
 





Re: Bitmap indexes - reviving CASSANDRA-1472

2013-04-12 Thread Brian O'Neill
@Jason,

I have a lot of experience with SOLR + ES, but mainly for search.  (i.e.
Finding the most relevant records given a query)
That's been working well, but now we have requirements to support
dashboards.  Those dashboards have aggregations in them (sum, average,
count(s), etc).  I have limited experience using filter functions and
facets to achieve similar things w/ Lucene, but they never seemed to
perform well when the sets were large.

If Lucene/SOLR/ES can support this kind of functionality, we'd gladly use
it instead. (Let me know!)

When we looked around, Druid seemed to fit the bill exactly: (and it was
open source)
http://metamarkets.com/2011/druid-part-i-real-time-analytics-at-a-billion-rows-per-second/

BTW, here is more information on the compression that Druid uses:
http://metamarkets.com/2012/druid-bitmap-compression/


To echo Matt's sentiment, we'd love to leverage a C* native capability for
this.
(Acunu provides most of the capability, but it isn't open source)

I think once we have the conditional write semantics that are coming, we
could layer this on top of C*. (extending the secondary indexes
functionality)

-brian



---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com

 






On 4/12/13 12:46 AM, Matt Stump mrevilgn...@gmail.com wrote:

You could embed Lucene, but then you pretty much have DSE search, and there
are people on this list in a better position than I to describe the
difficulty in making that scale. By rolling your own you get simplicity and
control. If you use a uniform index size you can just assign chunks of it to
the Cassandra ring, making it easy to distribute queries. I think that using
Lucene in this way would cause most of the benefit of the library to be
lost, and add unnecessary complexity. If Lucene were easy, then I think
given the team's experience with both Lucene and C* it would have been done
already.

Sorry if it's a fuzzy answer, but I haven't run down every technical angle
on the integration with C* yet. The idea was still very much in the
"wouldn't it be very cool if this thing lived in Cassandra" stage. It would
be the nail in the coffin for impala, redshift, et al.


On Thu, Apr 11, 2013 at 3:15 PM, Jason Rutherglen 
jason.rutherg...@gmail.com wrote:

 What's the advantage over Lucene?


 On Wed, Apr 10, 2013 at 10:43 PM, Matt Stump mrevilgn...@gmail.com
 wrote:

  Druid was our inspiration to layer bitmap indexes on top of Cassandra.
  Druid doesn't work for us because our data set is too large. We would need
  many hundreds of nodes just for the pre-processed data. What I envisioned
  was the ability to perform druid style queries (no aggregation) without the
  limitations imposed by having the entire dataset in memory. I primarily
  need to query whether a user performed some event, but I also intend to add
  trigram indexes for LIKE, ILIKE or possibly regex style matching.

  I wasn't aware of CONCISE, thanks for the pointer. We are currently
  evaluating fastbit, which is a very similar project:
  https://sdm.lbl.gov/fastbit/
 
 
  On Wed, Apr 10, 2013 at 5:49 PM, Brian O'Neill b...@alumni.brown.edu
  wrote:
 
  
   How does this compare with Druid?
   https://github.com/metamx/druid

   We're currently evaluating Acunu, Vertica and Druid...
   http://brianoneill.blogspot.com/2013/04/bianalytics-on-big-datacassandra.html

   With its bitmapped indexes, Druid appears to have the most potential.
   They boast some pretty impressive stats, especially WRT handling
   real-time updates and adding new dimensions.

   They also use a compression algorithm, CONCISE, to cut down on the space
   requirements.
   http://ricerca.mat.uniroma3.it/users/colanton/concise.html

   I haven't looked too deep into the Druid code, but I've been meaning to
   see if it could be backed by C*.

   We'd be game to join the hunt if you pursue such a beast. (with your code,
   or with portions of Druid)

   -brian


   On Apr 10, 2013, at 5:40 PM, mrevilgnome wrote:

    What do you think about set manipulation via indexes in Cassandra? I'm
    interested in answering queries such as "give me all users that performed
    event 1, 2, and 3, but not 4"
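
As an illustration of the query in the quoted question, here is a small sketch using one bitmap per event, where bit N is set when user N performed that event. It uses only `java.util.BitSet`; the class name `EventBitmaps`, the user ids, and the uncompressed in-memory layout are assumptions for the example, not how Druid, CONCISE, or any Cassandra index stores its data.

```java
import java.util.BitSet;

/** Sketch: "performed events 1, 2 and 3, but not 4" as bitmap AND / AND-NOT. */
final class EventBitmaps {
    private EventBitmaps() {}

    /** Returns users present in every required bitmap and in no excluded bitmap. */
    static BitSet allOfButNot(BitSet[] required, BitSet[] excluded) {
        if (required.length == 0) {
            return new BitSet();
        }
        BitSet result = (BitSet) required[0].clone();   // do not mutate the inputs
        for (int i = 1; i < required.length; i++) {
            result.and(required[i]);                    // intersection
        }
        for (BitSet e : excluded) {
            result.andNot(e);                           // set difference
        }
        return result;
    }

    /** Builds a bitmap with the given (hypothetical) user ids set. */
    static BitSet users(int... ids) {
        BitSet bits = new BitSet();
        for (int id : ids) {
            bits.set(id);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet event1 = users(1, 2, 3, 5);
        BitSet event2 = users(2, 3, 5);
        BitSet event3 = users(3, 5, 8);
        BitSet event4 = users(5);
        BitSet match = allOfButNot(new BitSet[] {event1, event2, event3},
                                   new BitSet[] {event4});
        System.out.println(match); // prints {3}
    }
}
```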

Re: Compund/Composite column names

2013-01-09 Thread Brian O'Neill
Sorry, just got time to submit it.

Here you go:
https://issues.apache.org/jira/browse/CASSANDRA-5138

-brian

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com


 


From:  Sylvain Lebresne sylv...@datastax.com
Date:  Monday, December 17, 2012 10:35 AM
To:  dev@cassandra.apache.org
Cc:  Vivek Mishra vivek.mis...@yahoo.com, Brian O'Neill
b...@alumni.brown.edu
Subject:  Re: Compund/Composite column names

Feel free to open a ticket with steps to reproduce. We can certainly throw a
more meaningful exception.


On Mon, Dec 17, 2012 at 4:11 PM, Edward Capriolo edlinuxg...@gmail.com
wrote:
 This was discussed in one of the tickets. The problem with CQL3's sparse
 tables is that they have different metadata that has NOT been added to
 thrift's CFMetaData. Thus thrift is unaware of exactly how to verify the
 insert.
 
 Originally it was made impossible for thrift to see a sparse table, but
 that restriction seems to have been lifted. It is probably a bad idea to
 thrift insert into a sparse table as long as Cassandra has two distinct
 sources of meta information.
 
 
 
 
 
 On Mon, Dec 17, 2012 at 9:52 AM, Vivek Mishra vivek.mis...@yahoo.com wrote:
 
  Looks like Thrift API is not working as expected?
 
  -Vivek
 
 
 
 
  
   From: Brian O'Neill b...@alumni.brown.edu
  To: dev@cassandra.apache.org
  Cc: Vivek Mishra vivek.mis...@yahoo.com
  Sent: Monday, December 17, 2012 8:12 PM
  Subject: Re: Compund/Composite column names
 
  FYI -- I'm still seeing this on 1.2-beta1.
 
  If you create a table via CQL and then insert into it (via the Java API)
  with an incorrect number of components, the insert works, but select *
  from CQL results in a TSocket read error.
 
  I showed this in the webinar last week, just in case people ran into
  it.  It would be great to translate the ArrayIndexOutofBoundsException
  from the server side into something meaningful in cqlsh to help people
  diagnose the problem.  (a regular user probably doesn't have access to
  the server-side logs)
 
  You can see it at minute 41 in the video from the webinar:
  http://www.youtube.com/watch?v=AdfugJxfd0o&feature=youtu.be
 
  -brian
 
 
  On Tue, Oct 9, 2012 at 9:39 AM, Jonathan Ellis jbel...@gmail.com wrote:
   Sounds like you're running into the keyspace drop bug.  It's mostly
  fixed
   in 1.1.5 but you might need the latest from 1.1 branch.  1.1.6 will be
   released soon with the final fix.
   On Oct 9, 2012 1:58 AM, Vivek Mishra vivek.mis...@yahoo.com wrote:
  
  
  
   Ok. I am able to understand the problem now. Issue is:

   If i create a column family altercations as:

   CREATE TABLE altercations (
     instigator text,
     started_at timestamp,
     ships_destroyed int,
     energy_used float,
     alliance_involvement boolean,
     PRIMARY KEY (instigator,started_at,ships_destroyed)
   );

   INSERT INTO altercations (instigator, started_at, ships_destroyed,
     energy_used, alliance_involvement)
     VALUES ('Jayne Cobb', '2012-07-23', 2, 4.6, 'false');

   It works!

   But if i create a column family with compound primary key with 2
   composite column as:

   CREATE TABLE altercations (
     instigator text,
     started_at timestamp,
     ships_destroyed int,
     energy_used float,
     alliance_involvement boolean,
     PRIMARY KEY (instigator,started_at)
   );

   and Then drop this column family:

   drop columnfamily altercations

Re: Compund/Composite column names

2012-12-17 Thread Brian O'Neill
FYI -- I'm still seeing this on 1.2-beta1.

If you create a table via CQL and then insert into it (via the Java API)
with an incorrect number of components, the insert works, but select *
from CQL results in a TSocket read error.

I showed this in the webinar last week, just in case people ran into
it.  It would be great to translate the ArrayIndexOutofBoundsException
from the server side into something meaningful in cqlsh to help people
diagnose the problem.  (a regular user probably doesn't have access to
the server-side logs)
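
To make "incorrect number of components" concrete, here is a client-side sketch that counts the components of a composite column name before it is written. It assumes the CompositeType layout of a 2-byte big-endian length, the value bytes, and one end-of-component byte per component. The class `CompositeNames` and its methods are hypothetical and are not part of Cassandra, Astyanax, or Hector; the expected count must come from the caller's knowledge of the schema.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Sketch: count composite components client-side (assumed wire layout). */
final class CompositeNames {
    private CompositeNames() {}

    /** Encodes each component as: 2-byte length, UTF-8 value, end-of-component byte. */
    static byte[] encode(String... components) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (String c : components) {
            byte[] value = c.getBytes(StandardCharsets.UTF_8);
            out.write((value.length >> 8) & 0xFF);
            out.write(value.length & 0xFF);
            out.write(value, 0, value.length);
            out.write(0); // end-of-component marker
        }
        return out.toByteArray();
    }

    /** Walks the encoded name and returns how many components it contains. */
    static int countComponents(byte[] name) {
        ByteBuffer buf = ByteBuffer.wrap(name); // big-endian by default
        int count = 0;
        while (buf.hasRemaining()) {
            if (buf.remaining() < 2) {
                throw new IllegalArgumentException("truncated component length");
            }
            int len = buf.getShort() & 0xFFFF;
            if (buf.remaining() < len + 1) {
                throw new IllegalArgumentException("truncated component value");
            }
            buf.position(buf.position() + len + 1); // skip value and marker byte
            count++;
        }
        return count;
    }

    /** Fails with a readable message instead of letting a bad name reach the server. */
    static void requireComponents(byte[] name, int expected) {
        int actual = countComponents(name);
        if (actual != expected) {
            throw new IllegalArgumentException(
                    "column name has " + actual + " components, schema expects " + expected);
        }
    }

    public static void main(String[] args) {
        byte[] name = encode("2012-07-23", "2");
        System.out.println(countComponents(name)); // prints 2
    }
}
```

A check like this only gives a clearer error on the client; it does not replace server-side validation at insert time.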

You can see it at minute 41 in the video from the webinar:
http://www.youtube.com/watch?v=AdfugJxfd0o&feature=youtu.be

-brian


On Tue, Oct 9, 2012 at 9:39 AM, Jonathan Ellis jbel...@gmail.com wrote:
 Sounds like you're running into the keyspace drop bug.  It's mostly fixed
 in 1.1.5 but you might need the latest from 1.1 branch.  1.1.6 will be
 released soon with the final fix.
 On Oct 9, 2012 1:58 AM, Vivek Mishra vivek.mis...@yahoo.com wrote:



 Ok. I am able to understand the problem now. Issue is:

 If i create a column family altercations as:

 CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at,ships_destroyed)
 );

 INSERT INTO altercations (instigator, started_at, ships_destroyed,
   energy_used, alliance_involvement)
   VALUES ('Jayne Cobb', '2012-07-23', 2, 4.6, 'false');

 It works!

 But if i create a column family with compound primary key with 2 composite
 column as:

 CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at)
 );

 and Then drop this column family:

 drop columnfamily altercations;

 and then try to create same one with primary compound key with 3 composite
 column:

 CREATE TABLE altercations (
   instigator text,
   started_at timestamp,
   ships_destroyed int,
   energy_used float,
   alliance_involvement boolean,
   PRIMARY KEY (instigator,started_at,ships_destroyed)
 );

 it gives me error: TSocket read 0 bytes

 Rest, as no column family is created, so nothing onwards will work.

 Is this an issue?

 -Vivek


 
  From: Jonathan Ellis jbel...@gmail.com
 To: dev@cassandra.apache.org; Vivek Mishra vivek.mis...@yahoo.com
 Sent: Tuesday, October 9, 2012 9:08 AM
 Subject: Re: Compund/Composite column names

 Works for me on latest 1.1 in cql3 mode.  cql2 mode gives a parse error.

 On Mon, Oct 8, 2012 at 9:18 PM, Vivek Mishra vivek.mis...@yahoo.com
 wrote:
  Hi All,
 
  I am trying to use compound primary key column name and i am referring
 to:
  http://www.datastax.com/dev/blog/whats-new-in-cql-3-0
 
 
  As mentioned on this example, i tried to create a column family
 containing compound primary key (one or more) as:
 
   CREATE TABLE altercations (
 instigator text,
 started_at timestamp,
 ships_destroyed int,
 energy_used float,
 alliance_involvement boolean,
 PRIMARY KEY (instigator,started_at,ships_destroyed)
 );
 
  And i am getting:
 
 
  TSocket read 0 bytes
  cqlsh:testcomp>

  Then followed by insert and select statements giving me following errors:

  cqlsh:testcomp> INSERT INTO altercations (instigator, started_at, ships_destroyed,
              ...  energy_used, alliance_involvement)
              ...  VALUES ('Jayne Cobb', '2012-07-23', 2, 4.6, 'false');
  TSocket read 0 

Re: Compund/Composite column names

2012-12-17 Thread Brian O'Neill

Will do.  

---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com


 


From:  Sylvain Lebresne sylv...@datastax.com
Date:  Monday, December 17, 2012 10:35 AM
To:  dev@cassandra.apache.org
Cc:  Vivek Mishra vivek.mis...@yahoo.com, Brian O'Neill
b...@alumni.brown.edu
Subject:  Re: Compund/Composite column names

Feel free to open a ticket with steps to reproduce. We can certainly throw a
more meaningful exception.



Re: CQL/CLI Experiments w/ 1.2

2012-12-10 Thread Brian O'Neill

Thanks for the explanation(s).

I'm going to give a "Create your first java app for Cassandra" webinar on
Wednesday, and I was trying to embrace schema creation in CQL, but didn't
want to have to use CompositeTypes right off the bat.  (I'll go with
compact storage)

I think I can explain away the empty row/column, but we should probably
publicize that.  I can see that question coming up on every client/api
user list. (hector, astyanax, etc.)

-brian


---
Brian O'Neill
Lead Architect, Software Development
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 •
healthmarketscience.com

 






On 12/10/12 3:16 AM, Sylvain Lebresne sylv...@datastax.com wrote:

There is some more details in
http://www.datastax.com/dev/blog/thrift-to-cql3 but to answer your
questions:


 Question 1:
 What is the empty column/value?


The technical reasons are here:
https://issues.apache.org/jira/browse/CASSANDRA-4361. But basically, it's a
CQL3 implementation detail.

Question 2:
 It also appears as though the column names are CompositeType even
 though there is only one component:


Yes, it is the case, and the reason is that this is required to accept
collections (even if you don't use collections initially, not using a
composite means you wouldn't be able to add some later). If you explicitly
don't want a CompositeType underneath, you'll need to use the 'WITH COMPACT
STORAGE' option (in which case you will not be able to use collections,
obviously).

--
Sylvain




CQL/CLI Experiments w/ 1.2

2012-12-09 Thread Brian O'Neill
I'm using the following schema and data:
CREATE TABLE children ( childId varchar, firstName varchar, lastName
varchar, timezone varchar, PRIMARY KEY (childId ) );
insert into children (childId, firstName, lastName, timezone) values
('bart.simpson', 'Bart', 'Simpson', 'PST');
insert into children (childId, firstName, lastName, timezone) values
('dennis.menace', 'Dennis', 'Menace', 'PST');

All is well on the CQL side of things, but when I go over into CLI, I
see the following:

[default@northpole] list children;
Using default limit of 100
Using default column limit of 100
---
RowKey: bart.simpson
=> (column=, value=, timestamp=1355116106465000)
=> (column=firstname, value=42617274, timestamp=1355116106465000)
=> (column=lastname, value=53696d70736f6e, timestamp=1355116106465000)
=> (column=timezone, value=505354, timestamp=1355116106465000)
---
RowKey: dennis.menace
=> (column=, value=, timestamp=1355116106466000)
=> (column=firstname, value=44656e6e6973, timestamp=1355116106466000)
=> (column=lastname, value=4d656e616365, timestamp=1355116106466000)
=> (column=timezone, value=505354, timestamp=1355116106466000)
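
The values in the CLI listing above are the hex form of the stored bytes. A small sketch for reading them back as text, assuming the values are UTF-8 strings (which matches the varchar columns in the schema); the class name `CliHex` is hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Sketch: decode hex values from the CLI listing as UTF-8 text. */
final class CliHex {
    private CliHex() {}

    /** Converts a hex string such as "42617274" into the text it encodes. */
    static String decodeUtf8Hex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits is one byte.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(decodeUtf8Hex("42617274"));       // prints Bart
        System.out.println(decodeUtf8Hex("53696d70736f6e")); // prints Simpson
        System.out.println(decodeUtf8Hex("505354"));         // prints PST
    }
}
```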

Question 1:
What is the empty column/value?   I ask because it causes
confusion/issues when accessing it from a Java API. (like Astyanax)
That column and value are in the result set.  Should clients start
ignoring empty column names/values?

Question 2:
It also appears as though the column names are CompositeType even
though there is only one component:  (below is from CLI)
  Columns sorted by:
org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type)

Because of that, I would need to use CompositeTypes in my java app to
insert into the table.
Is there any way to create a table via CQL3 that doesn't force me to
use Composite types in my Java app?
(In CQL2, we could specify comparators, but I don't see that in CQL3)

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: TSocket read 0 bytes from cqlsh

2012-10-04 Thread Brian O'Neill
Obfuscated slightly

The table is something similar to:

CREATE TABLE data (
  uid varchar,
  t timestamp,
  foo varchar,
  bar varchar,
  PRIMARY KEY (uid, t, foo, bar)
);

Then I can insert just fine via Astyanax and I can see the row via
cli, but the select statement fails in cqlsh.

The table is fine when I only interact with it through CQL. I can
insert and select fine, until I insert a row from Astyanax.

If needed, I can probably create a small test for this that I can share.

-brian



On Thu, Oct 4, 2012 at 3:08 PM, Jonathan Ellis jbel...@gmail.com wrote:
 What kind of data did you insert, and what was expected?  Expected
 behavior would be to reject nonconforming data at insert time.

 On Thu, Oct 4, 2012 at 2:04 PM, Brian O'Neill b...@alumni.brown.edu wrote:
 This is probably already on your radar, but we could use a better
 error message from cqlsh when the column key doesn't conform to the
 expected schema...

 I accidentally inserted data using Astyanax that didn't conform to the
 schema.  After that, selects from that table via cqlsh return no
 useful information.
 (CLI shows the data just fine)


 bone@boneill-macbook-wired:~/tools/cassandra- bin/cassandra-cli
 Connected to: Test Cluster on 127.0.0.1/9160
 Welcome to Cassandra CLI version 1.1.5

 Type 'help;' or '?' for help.
 Type 'quit;' or 'exit;' to quit.

 [default@unknown] use cirrus;
 Authenticated to keyspace: cirrus
 [default@cirrus] list data;
 Using default limit of 100
 Using default column limit of 100
 ---
 RowKey: PI7JC8
 => (column=*, value=2014-07-31, timestamp=1349376866686000)
 ---
 RowKey: PI1234
 => (column=*, value=Y, timestamp=1349372660453000)

 2 Rows Returned.
 Elapsed time: 212 msec(s).
 [default@cirrus] quit;
 bone@boneill-macbook-wired:~/tools/cassandra- bin/cqlsh -3
 Connected to Test Cluster at localhost:9160.
 [cqlsh 2.2.0 | Cassandra 1.1.5 | CQL spec 3.0.0 | Thrift protocol 19.32.0]
 Use HELP for help.
 cqlsh> use cirrus;
 cqlsh:cirrus> select * from data;
 TSocket read 0 bytes
 cqlsh:cirrus>

 --
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://brianoneill.blogspot.com/
 twitter: @boneill42



 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com



-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)

mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: TSocket read 0 bytes from cqlsh

2012-10-04 Thread Brian O'Neill
Here you go...

ERROR 14:57:37,270 Error occurred during processing of message.
java.lang.ArrayIndexOutOfBoundsException: 4
    at org.apache.cassandra.cql3.statements.SelectStatement.process(SelectStatement.java:773)
    at org.apache.cassandra.cql3.statements.SelectStatement.execute(SelectStatement.java:137)
    at org.apache.cassandra.cql3.QueryProcessor.processStatement(QueryProcessor.java:108)
    at org.apache.cassandra.cql3.QueryProcessor.process(QueryProcessor.java:121)
    at org.apache.cassandra.thrift.CassandraServer.execute_cql_query(CassandraServer.java:1237)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3542)
    at org.apache.cassandra.thrift.Cassandra$Processor$execute_cql_query.getResult(Cassandra.java:3530)
    at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:32)
    at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:34)
    at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:186)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:680)


-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)

mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: TSocket read 0 bytes from cqlsh

2012-10-04 Thread Brian O'Neill
From this, I assume I inserted the wrong number of values into the
compound key from Astyanax.  It would be nice to carry this error
across to the CQL client.

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)

mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: TSocket read 0 bytes from cqlsh

2012-10-04 Thread Brian O'Neill
Here you go...

// 
//  IN CQLSH
// 
CREATE KEYSPACE cirrus WITH strategy_class = 'NetworkTopologyStrategy'
AND strategy_options:datacenter1 = '1';

use cirrus;

CREATE TABLE data (
  uid varchar,
  t timestamp,
  foo varchar,
  bar varchar,
  PRIMARY KEY (uid, t, foo)
);


// 
// Then in CLI
// 
use cirrus;
set data['PI7JC8KRF6']['1349110576']='2014-07-31';
list data;

// Note: I intentionally didn't supply a value for foo in the primary
// key definition.
// Listing works.


// 
// Then in CQLSH
// 
select * from data;

// The result is...
cqlsh:cirrus> select * from data;
TSocket read 0 bytes

On Thu, Oct 4, 2012 at 3:31 PM, Brian O'Neill b...@alumni.brown.edu wrote:
 I was able to reproduce with CLI.  I'll send over the example as soon
 as I can obfuscate it.

 -brian

 On Thu, Oct 4, 2012 at 3:19 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Nothing jumps out at me, varchar should be pretty straightforward.
 Probably going to need a test case.  (Even better if you can repro w/
 cli instead of needing Astyanax.)

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)

mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: TSocket read 0 bytes from cqlsh

2012-10-04 Thread Brian O'Neill
Perfect. Tnx.

On Thu, Oct 4, 2012 at 3:37 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Oh, I see.  I misunderstood at first.  Yes, the thrift side in 1.1
 doesn't validate cql3 composites.  This should be fixed in 1.2 beta1;
 see 
 https://issues.apache.org/jira/browse/CASSANDRA-4377?focusedCommentId=13436817&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13436817

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)

mobile:215.588.6024
blog: http://brianoneill.blogspot.com/
twitter: @boneill42


Re: TSocket read 0 bytes from cqlsh

2012-10-04 Thread Brian O'Neill
Scratch that; it can change on a per-column basis.

Strange world this Java API vs. CQL.

-brian

On Thu, Oct 4, 2012 at 3:57 PM, Brian O'Neill b...@alumni.brown.edu wrote:
 Actually, I found the underlying issue...

 CQL appends the *name* of the value column into the compound key.

 Using the previous schema:
 insert into data (uid, t, foo, bar) values ('PI7JC8KRF6',
 '1349110576', 'foovalue', 'barvalue')

 list data;
 RowKey: PI7JC8KRF6
 => (column=1970-01-16 09:45:10-0500:foovalue:bar, value=barvalue,
 timestamp=1349380029082000)

 Notice bar is on the end of the column name.

 If you don't have that element represented from the Java API (in this
 case, w/ Astyanax), you end up with misaligned interpretation of the
 compound key.  I'll add an extra element to the composite type in
 Astyanax, which should fix things.  I'll also add this to my blog so
 other people don't get tripped up.
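
The misalignment described above can be sketched in a few lines of Python. This assumes the CompositeType layout in which each component is written as a two-byte big-endian length, the component bytes, and a one-byte end-of-component marker, and that the timestamp is an 8-byte count of milliseconds. `pack_composite` and `unpack_composite` are hypothetical helpers for illustration, not part of Astyanax or Cassandra.

```python
import struct


def pack_composite(components):
    """Pack byte-string components: 2-byte length, bytes, end-of-component byte."""
    out = b""
    for component in components:
        out += struct.pack(">H", len(component)) + component + b"\x00"
    return out


def unpack_composite(raw):
    """Split a packed composite name back into its components."""
    parts, pos = [], 0
    while pos < len(raw):
        (length,) = struct.unpack_from(">H", raw, pos)
        pos += 2
        parts.append(raw[pos:pos + length])
        pos += length + 1  # skip the end-of-component byte
    return parts


# Assumption: t is stored as an 8-byte big-endian count of milliseconds.
t = struct.pack(">q", 1349110576)

# What CQL writes for the value column "bar": (t, foo, name of the column).
cql_name = pack_composite([t, b"foovalue", b"bar"])

# What a client writes if it leaves the trailing column name out.
short_name = pack_composite([t, b"foovalue"])

print(len(unpack_composite(cql_name)))    # 3 components
print(len(unpack_composite(short_name)))  # 2 components
```

A reader that expects three components per column name would run off the end of the second name, which is consistent with the kind of out-of-bounds failure seen earlier in the thread.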

 Any insight into why CQL puts that in column name?
 Where does it store the metadata related to compound key
 interpretation? Wouldn't that be a better place for that since it
 shouldn't change within a table?

 -brian



Re: Server Side Logic/Script - Triggers / StoreProc

2012-04-22 Thread Brian O'Neill
Praveen,

We are certainly interested. To get things moving we implemented an add-on for 
Cassandra to demonstrate the viability (using AOP):
https://github.com/hmsonline/cassandra-triggers

Right now the implementation executes triggers asynchronously, allowing you to 
implement a Java interface and plug in your own Java class that will get called 
for every insert.

Per the discussion on CASSANDRA-1311, we intend to extend our proof of concept 
to be able to invoke scripts as well.  (Minimally we'll enable JavaScript, but 
we'll probably allow for Ruby and Groovy as well.)
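
To make the asynchronous model concrete, here is a minimal sketch in Python (the add-on itself is Java). The `Trigger` class, its `process` method, and `TriggerDispatcher` are hypothetical names chosen for illustration and are not the cassandra-triggers API. The idea is only that the write path enqueues the mutation and returns, while a background worker calls each registered trigger later.

```python
import queue
import threading


class Trigger:
    """Hypothetical interface: subclasses override process()."""

    def process(self, row_key, columns):
        raise NotImplementedError


class RecordingTrigger(Trigger):
    """Example trigger that just remembers what it was called with."""

    def __init__(self):
        self.seen = []

    def process(self, row_key, columns):
        self.seen.append((row_key, dict(columns)))


class TriggerDispatcher:
    """Runs triggers on a worker thread so inserts do not wait for them."""

    def __init__(self, triggers):
        self.triggers = list(triggers)
        self.pending = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def on_insert(self, row_key, columns):
        # Called on the write path: enqueue and return immediately.
        self.pending.put((row_key, columns))

    def _run(self):
        while True:
            row_key, columns = self.pending.get()
            for trigger in self.triggers:
                trigger.process(row_key, columns)
            self.pending.task_done()

    def drain(self):
        # Block until every queued insert has been handed to the triggers.
        self.pending.join()


recorder = RecordingTrigger()
dispatcher = TriggerDispatcher([recorder])
dispatcher.on_insert("dennis.menace", {"timezone": "PST"})
dispatcher.drain()
print(recorder.seen)
```

The trade-off of this design is that a trigger failure cannot reject the insert that caused it, since the insert has already returned by the time the trigger runs.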

-brian

On Apr 22, 2012, at 12:23 PM, Praveen Baratam wrote:

 I found that Triggers are coming in Cassandra 1.2 
 (https://issues.apache.org/jira/browse/CASSANDRA-1311) but no mention of any 
 StoreProc like pattern.
 
 I know this has been discussed so many times but never met with any 
 initiative. Even Groovy was staged out of the trunk.
 
 Cassandra is great for logging and as such will be infinitely more useful if 
 some logic can be pushed into the Cassandra cluster nearer to the location of 
 Data to generate a materialized view useful for applications.
 
 Server Side Scripts/Routines in Distributed Databases could soon prove to be 
 the differentiating factor.
 
 Let me reiterate things with a use case.
 
 In our application we store time series data in wide rows with TTL set on 
 each point to prevent data from growing beyond acceptable limits. Still the 
 data size can be a limiting factor to move all of it from the cluster node to 
 the querying node and then to the application via thrift for processing and 
 presentation.
 
 Ideally we should process the data on the residing node and pass only the 
 materialized view of the data upstream. This should be trivial if Cassandra 
 implements some sort of server side scripting and CQL semantics to call it.
 
 Is anybody else interested in a similar feature? Is it being worked on? Are 
 there any alternative strategies to this problem?
 
 Praveen
 
 

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/



kudos...

2012-04-02 Thread Brian O'Neill
I just wanted to let you guys know that I gave you a shout out...
http://brianoneill.blogspot.com/2012/04/cassandra-vs-couchdb-mongodb-riak-hbase.html

thanks for all the support,
brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: Document storage

2012-03-30 Thread Brian O'Neill

Do we also need to consider the client API?
If we don't adjust Thrift, the client just gets bytes, right?
The client is on its own to marshal them back into a structure.  In this
case, it seems like we would want to choose a standard that is efficient
and for which there are common libraries.  Protobuf seems to fit the bill
here.

Or do we pass back some other structure?  (Native lists/maps? JSON
strings?)

Do we ignore sorting/comparators?
(similar to SOLR, I'm not sure people have defined a good sort for
multi-valued items)

-brian

 
Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/



On 3/30/12 12:01 PM, Daniel Doubleday daniel.double...@gmx.net wrote:

 Just telling C* to store a byte[] *will* be slightly lighter-weight
 than giving it named columns, but we're talking negligible compared to
 the overhead of actually moving the data on or off disk in the first
 place. 
Hm - but isn't this exactly the point? You don't want to move data off
disk.
But decomposing into columns will lead to more of that:

- Total amount of serialized data is (in most cases a lot) larger than
protobuffed / compressed version
- If you do selective updates the document will be scattered over
multiple ssts plus if you do sliced reads you can't optimize reads as
opposed to the single column version that when updated is automatically
superseding older versions so most reads will hit only one sst

All these reads make the hot dataset. If it fits the page cache, you're
fine. If it doesn't, you need to buy more iron.

Really could not resist because your statement seems to be contrary to
all our tests / learnings.

Cheers,
Daniel

From dev list:

Re: Document storage
On Thu, Mar 29, 2012 at 1:11 PM, Drew Kutcharian d...@venarc.com wrote:
 I think this is a much better approach because that gives you the
 ability to update or retrieve just parts of objects efficiently,
 rather than making column values just blobs with a bunch of special
 case logic to introspect them.  Which feels like a big step backwards
 to me.

 Unless your access pattern involves reading/writing the whole document
 each time. In that case you're better off serializing the whole document
 and storing it in a column as a byte[] without incurring the overhead of
 column indexes. Right?

Hmm, not sure what you're thinking of there.

If you mean the index that's part of the row header for random
access within a row, then no, serializing to byte[] doesn't save you
anything.

If you mean secondary indexes, don't declare any if you don't want any. :)

Just telling C* to store a byte[] *will* be slightly lighter-weight
than giving it named columns, but we're talking negligible compared to
the overhead of actually moving the data on or off disk in the first
place.  Not even close to being worth giving up being able to deal
with your data from standard tools like cqlsh, IMO.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com





Re: Document storage

2012-03-29 Thread Brian O'Neill
Jonathan, 

I was actually going to take this up with Nate McCall a few weeks back.  I
think it might make sense to get the client development community together
(Netflix w/ Astyanax, Hector, Pycassa, Virgil, etc.)

I agree whole-heartedly that it shouldn't go into the database for all the
reasons you point out.

If we can all decide on some standards for data storage (e.g. composite
types), indexing strategies, etc., we can provide higher-level functions
through the client libraries and also provide interoperability between
them (without bloating Cassandra).

CCing Nate.  Nate, thoughts?
I wouldn't mind coordinating/facilitating the conversation, if we know
who should be involved.

-brian

 
Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/







On 3/29/12 3:06 PM, Ben McCann b...@benmccann.com wrote:

Jonathan, I asked Brian about his REST API
(https://groups.google.com/forum/?fromgroups#!topic/virgil-users/oncBas9C8Us)
and he said he does not take the JSON objects and split them because the
client libraries do not agree on implementations.  This was exactly my
concern as well with this solution.  I would be perfectly happy to do it
this way instead of using JSON if it were standardized.  The reason I
suggested JSON is that it is standardized.  As far as I can tell,
Cassandra doesn't support maps and lists in a standardized way today,
which is the root of my problem.

-Ben


On Thu, Mar 29, 2012 at 11:30 AM, Drew Kutcharian d...@venarc.com wrote:

 Yes, I meant the row header index. What I have done is that I'm storing
 an object (i.e. UserProfile) where you read or write it as a whole (a
 user updates their user details in a single page in the UI). So I
 serialize that object into a binary JSON using SMILE format. I then
 compress it using Snappy on the client side. So as far as Cassandra
 cares it's storing a byte[].
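
The serialize-then-compress approach described above can be sketched roughly as follows. To keep the example self-contained it swaps in Python's standard `json` and `zlib` modules for SMILE and Snappy, so it shows the shape of the technique rather than the actual formats; `to_blob`, `from_blob`, and the sample profile are made up for illustration.

```python
import json
import zlib


def to_blob(document):
    """Serialize a document and compress it into one opaque byte string."""
    encoded = json.dumps(document, sort_keys=True).encode("utf-8")
    return zlib.compress(encoded)


def from_blob(blob):
    """Reverse of to_blob: decompress, then parse."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))


profile = {"firstname": "Dennis", "lastname": "Menace", "timezone": "PST"}
blob = to_blob(profile)  # this is what would be stored in the column
print(from_blob(blob) == profile)
```

Cassandra would only ever see `blob`, which is why server-side tools cannot inspect individual fields without a custom type that knows how to reverse both steps.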

 Now on the client side, I'm using cassandra-cli with a custom type that
 knows how to turn a byte[] into a JSON text and back. The only issue was
 CASSANDRA-4081 where assume doesn't work with custom types. If
 CASSANDRA-4081 gets fixed, I'll get the best of both worlds.

 Also advantages of this vs. the thrift based Super Column families are:

 1. Saving extra CPU usage on the Cassandra nodes, since
 serialize/deserialize and compression/decompression happen on the client
 nodes, where there is plenty of idle CPU time

 2. Saving network bandwidth since I'm sending over a compressed byte[]


 -- Drew



 On Mar 29, 2012, at 11:16 AM, Jonathan Ellis wrote:

  On Thu, Mar 29, 2012 at 1:11 PM, Drew Kutcharian d...@venarc.com
 wrote:
  I think this is a much better approach because that gives you the
  ability to update or retrieve just parts of objects efficiently,
  rather than making column values just blobs with a bunch of special
  case logic to introspect them.  Which feels like a big step
backwards
  to me.
 
  Unless your access pattern involves reading/writing the whole
document
 each time. In that case you're better off serializing the whole document
 and storing it in a column as a byte[] without incurring the overhead of
 column indexes. Right?
 
  Hmm, not sure what you're thinking of there.
 
  If you mean the index that's part of the row header for random
  access within a row, then no, serializing to byte[] doesn't save you
  anything.
 
  If you mean secondary indexes, don't declare any if you don't want
any.
 :)
 
  Just telling C* to store a byte[] *will* be slightly lighter-weight
  than giving it named columns, but we're talking negligible compared to
  the overhead of actually moving the data on or off disk in the first
  place.  Not even close to being worth giving up being able to deal
  with your data from standard tools like cqlsh, IMO.
 
  --
  Jonathan Ellis
  Project Chair, Apache Cassandra
  co-founder of DataStax, the source for professional Cassandra support
  http://www.datastax.com






Re: Document storage

2012-03-29 Thread Brian O'Neill

Jonathan,

We store JSON as our column values.  I'd love to see support for maps and
lists.  If I get some time this weekend, I'll take a look to see what is
required.  It doesn't seem like it would be that hard.

-brian

 
Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/







On 3/29/12 3:18 PM, Jonathan Ellis jbel...@gmail.com wrote:

On Thu, Mar 29, 2012 at 2:06 PM, Ben McCann b...@benmccann.com wrote:
 As far as I can tell, Cassandra
 doesn't support maps and lists in a standardized way today, which is the
 root of my problem.

I'm pretty serious about adding those for 1.2, for what that's worth.
(If you want to jump in and help code that up, so much the better.)

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com




Re: OoM querying very wide-row in CLI

2012-03-28 Thread Brian O'Neill
Sorry, I didn't realize we weren't hip to pulls yet.

I created a JIRA and attached the patch.
https://issues.apache.org/jira/browse/CASSANDRA-4098

-brian

On Tue, Mar 27, 2012 at 10:42 PM, Brian O'Neill b...@alumni.brown.edu wrote:

 Here she is:
 https://github.com/apache/cassandra/pull/8

 Verified functionally with the attached data script.

 -brian



 On Tue, Mar 27, 2012 at 9:49 PM, Brian O'Neill b...@alumni.brown.edu wrote:

 10-4.  I'll see if I can track it down and submit a pull request that
 specifies a default if one does not exist.

 -brian

 
 Brian O'Neill
 Lead Architect, Software Development
 Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
 p: 215.588.6024
 blog: http://weblogs.java.net/blog/boneill42/
 blog: http://brianoneill.blogspot.com/







 On 3/27/12 9:45 PM, Jonathan Ellis jbel...@gmail.com wrote:

 I believe we added support for specifying a column range to the cli
 recently.  I don't know if there is a default limit.
 
 On Tue, Mar 27, 2012 at 8:40 PM, Brian O'Neill b...@alumni.brown.edu
 wrote:
  Today, running 1.0.7, we saw a node crash with an OutOfMemory.
  We have a single row with ~10million columns in it. (using it as an
 index)
  Accidentally, we attempted to list the CF in CLI that had the wide-row.
   This caused the CLI to hang and then eventually crashed Cassandra with
 an
  OoM.
 
  I know this is a case of If it hurts when you do that, don't do that,
 but
  we may want to better protect against it in the CLI and/or the DB.  I
 know
  we limit row counts on lists in CLI.  Do we also limit column counts?
 If
  not, I don't mind submitting a patch for this.
 
  let me know,
  brian
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/
 
 
 
 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com





 --
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://weblogs.java.net/blog/boneill42/
 blog: http://brianoneill.blogspot.com/




-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


OoM querying very wide-row in CLI

2012-03-27 Thread Brian O'Neill
Today, running 1.0.7, we saw a node crash with an OutOfMemory error.
We have a single row with ~10 million columns in it (we use it as an index).
Accidentally, we attempted to list the CF in the CLI that had the wide row.
This caused the CLI to hang and then eventually crashed Cassandra with an
OoM.

I know this is a case of "If it hurts when you do that, don't do that," but
we may want to better protect against it in the CLI and/or the DB.  I know
we limit row counts on lists in the CLI.  Do we also limit column counts?  If
not, I don't mind submitting a patch for this.
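
To make the idea concrete, here is a minimal sketch of the kind of guard
being proposed: apply a default column limit whenever the caller does not
give one. This is not the cassandra-cli code or the eventual patch; the
class name, the method names, and the default of 100 are all assumptions
made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not the actual cassandra-cli code: it only shows
// applying a default column limit when the caller does not specify one.
class ColumnLimitSketch {
    // Assumed default; the thread does not say what value the CLI should use.
    static final int DEFAULT_COLUMN_LIMIT = 100;

    // Returns the requested limit, or the default when none (or a bad one) is given.
    static int effectiveLimit(Integer requested) {
        if (requested == null || requested <= 0) {
            return DEFAULT_COLUMN_LIMIT;
        }
        return requested;
    }

    // Copies at most the effective limit of column names out of a row,
    // stopping early instead of walking the whole (possibly huge) row.
    static List<String> firstColumns(Iterable<String> row, Integer requested) {
        int limit = effectiveLimit(requested);
        List<String> out = new ArrayList<>();
        for (String column : row) {
            if (out.size() >= limit) {
                break;
            }
            out.add(column);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> wideRow = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            wideRow.add("col" + i);
        }
        System.out.println(firstColumns(wideRow, null).size()); // default applied
        System.out.println(firstColumns(wideRow, 5).size());    // explicit limit
    }
}
```

A client-side cap like this only helps if the limit is also passed down to
the server in the slice request; otherwise the node still has to
materialize the whole row.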

let me know,
brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: OoM querying very wide-row in CLI

2012-03-27 Thread Brian O'Neill
10-4.  I'll see if I can track it down and submit a pull request that
specifies a default if one does not exist.

-brian

 
Brian O'Neill
Lead Architect, Software Development
Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
p: 215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/







On 3/27/12 9:45 PM, Jonathan Ellis jbel...@gmail.com wrote:

I believe we added support for specifying a column range to the cli
recently.  I don't know if there is a default limit.

On Tue, Mar 27, 2012 at 8:40 PM, Brian O'Neill b...@alumni.brown.edu
wrote:
 Today, running 1.0.7, we saw a node crash with an OutOfMemory.
 We have a single row with ~10million columns in it. (using it as an
index)
 Accidentally, we attempted to list the CF in CLI that had the wide-row.
  This caused the CLI to hang and then eventually crashed Cassandra with
an
 OoM.

 I know this is a case of If it hurts when you do that, don't do that,
but
 we may want to better protect against it in the CLI and/or the DB.  I
know
 we limit row counts on lists in CLI.  Do we also limit column counts?
If
 not, I don't mind submitting a patch for this.

 let me know,
 brian

 --
 Brian ONeill
 Lead Architect, Health Market Science (http://healthmarketscience.com)
 mobile:215.588.6024
 blog: http://weblogs.java.net/blog/boneill42/
 blog: http://brianoneill.blogspot.com/



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com




Re: OoM querying very wide-row in CLI

2012-03-27 Thread Brian O'Neill
Here she is:
https://github.com/apache/cassandra/pull/8

Verified functionally with the attached data script.

-brian



On Tue, Mar 27, 2012 at 9:49 PM, Brian O'Neill b...@alumni.brown.edu wrote:

 10-4.  I'll see if I can track it down and submit a pull request that
 specifies a default if one does not exist.

 -brian

 
 Brian O'Neill
 Lead Architect, Software Development
 Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406
 p: 215.588.6024
 blog: http://weblogs.java.net/blog/boneill42/
 blog: http://brianoneill.blogspot.com/







 On 3/27/12 9:45 PM, Jonathan Ellis jbel...@gmail.com wrote:

 I believe we added support for specifying a column range to the cli
 recently.  I don't know if there is a default limit.
 
 On Tue, Mar 27, 2012 at 8:40 PM, Brian O'Neill b...@alumni.brown.edu
 wrote:
  Today, running 1.0.7, we saw a node crash with an OutOfMemory.
  We have a single row with ~10million columns in it. (using it as an
 index)
  Accidentally, we attempted to list the CF in CLI that had the wide-row.
   This caused the CLI to hang and then eventually crashed Cassandra with
 an
  OoM.
 
  I know this is a case of If it hurts when you do that, don't do that,
 but
  we may want to better protect against it in the CLI and/or the DB.  I
 know
  we limit row counts on lists in CLI.  Do we also limit column counts?
 If
  not, I don't mind submitting a patch for this.
 
  let me know,
  brian
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/
 
 
 
 --
 Jonathan Ellis
 Project Chair, Apache Cassandra
 co-founder of DataStax, the source for professional Cassandra support
 http://www.datastax.com





-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Triggers?

2012-01-20 Thread Brian O'Neill
I just posted to the user list, but figured I would post here as well.

We had a big session today designing application-level triggers using a new
column family as a distributed commit log.
When I got back to my desk, I re-googled Cassandra triggers, and re-read:
https://issues.apache.org/jira/browse/CASSANDRA-1311

We had planned to implement something similar to the crack smoking
concept...
Keeping a separate column family that logged the mutation, which a trigger
could then act on and write-back upon success.
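
A rough in-memory sketch of that flow, for discussion only: log the
mutation in a separate "column family", let a trigger act on it, and clear
the entry on success. Nothing here is Cassandra API; TriggerLogSketch and
its methods are hypothetical names, and a map stands in for the log column
family.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// In-memory stand-in for the idea: a separate log of mutations that a
// trigger processes, with entries removed only after the trigger succeeds.
class TriggerLogSketch {
    // Plays the role of the column family used as a distributed commit log:
    // mutation id -> mutation payload, kept until the trigger succeeds.
    private final Map<Integer, String> pending = new LinkedHashMap<>();
    private int nextId = 0;

    // Step 1: record the mutation alongside applying it.
    int logMutation(String mutation) {
        int id = nextId++;
        pending.put(id, mutation);
        return id;
    }

    // Step 2: run the trigger over every logged mutation.
    // Step 3: "write back" by removing the entry only when the trigger
    // finished without throwing. Returns how many entries succeeded.
    int runTriggers(Consumer<String> trigger) {
        int succeeded = 0;
        List<Integer> ids = new ArrayList<>(pending.keySet());
        for (Integer id : ids) {
            try {
                trigger.accept(pending.get(id));
                pending.remove(id);
                succeeded++;
            } catch (RuntimeException e) {
                // Leave the entry in the log so a later pass can retry it.
            }
        }
        return succeeded;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        TriggerLogSketch log = new TriggerLogSketch();
        log.logMutation("insert users:42");
        log.logMutation("delete users:7");
        int done = log.runTriggers(m -> System.out.println("trigger saw: " + m));
        System.out.println(done + " processed, " + log.pendingCount() + " pending");
    }
}
```

Because failed entries stay in the log and are retried, a trigger in this
scheme may run more than once for the same mutation, so it would need to
be idempotent.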

Conceptually, this doesn't seem too difficult to implement.  Is anyone
working on this already?
If not, is it worth working on it and contributing it as a patch?
Or should we just keep it to our app layer?

-brian


-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


FYI -- BufferOverflowException out of CommitLog on trunk

2011-12-12 Thread Brian O'Neill
I haven't had time to look into it yet, but just wanted to let you guys
know that I hit this in case someone was in that code.

ERROR 14:07:31,215 Fatal exception in thread
Thread[COMMIT-LOG-WRITER,5,main]
java.nio.BufferOverflowException
at java.nio.Buffer.nextPutIndex(Buffer.java:501)
at java.nio.DirectByteBuffer.putInt(DirectByteBuffer.java:654)
at
org.apache.cassandra.db.commitlog.CommitLogSegment.write(CommitLogSegment.java:259)
at
org.apache.cassandra.db.commitlog.CommitLog$LogRecordAdder.run(CommitLog.java:568)
at
org.apache.cassandra.db.commitlog.PeriodicCommitLogExecutorService$1.runMayThrow(PeriodicCommitLogExecutorService.java:49)
at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
at java.lang.Thread.run(Thread.java:662)
 INFO 14:07:31,504 flushing high-traffic column family CFS(Keyspace='***',
ColumnFamily='***') (estimated 103394287 bytes)
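
For anyone unfamiliar with the exception at the top of that trace, here is
a tiny standalone sketch of how java.nio raises it: putInt needs four free
bytes and throws BufferOverflowException when fewer remain. This only
shows the JDK behaviour. It is not a diagnosis of the CommitLogSegment
problem, and the class name is made up.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;

// Demonstrates the JDK behaviour behind the top frames of the trace:
// writing an int into a buffer with fewer than 4 bytes remaining.
class BufferOverflowSketch {
    // Returns true if writing one int into a direct buffer of the given
    // capacity overflows, false if the write fits.
    static boolean putIntOverflows(int capacityBytes) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(capacityBytes);
        try {
            buffer.putInt(42);
            return false;
        } catch (BufferOverflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("capacity 2: overflow=" + putIntOverflows(2));
        System.out.println("capacity 4: overflow=" + putIntOverflows(4));
    }
}
```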

It happened during a fairly standard load process using M/R.

After that, the server refused to come down with a standard kill.

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: How is Cassandra being used?

2011-11-16 Thread Brian O'Neill
Lively thread...

+1 opt-in
+1 in separate module

I'll just substantiate Rick Shaw's comments.  If this is on by default, I
can see it making its way into production at a large corporation, at which
time the traffic would sound an alarm as suspicious activity, which would
immediately get the server's plug pulled and trigger an investigation.
 That would land the architect responsible for deploying that server in the
proverbial principal's office.  In the extreme case, that might
black-list the technology and add fuel to any debate that the corporation
should just stick with the 'proven enterprise' solutions.  That is not my
perspective; just be aware that in some large corporations it is an uphill
battle to deploy Cassandra in the first place given incumbent systems.

In every situation I've been in, even outside of large corporations, we
would need to disable this feature given the sensitivity of the data.

All that said... I would love to see this data. ;)
I'd love to know where our deployment lies on the spectrum of use.

Maybe a good old fashioned web form that allows companies to submit their
usage scenarios might accomplish the same goal? (and you could get
additional context information about the industry, etc.)  It wouldn't be
comprehensive, but it may be sufficiently representative.  Maybe you could
just output a couple lines at server start that said something like "Go
here http://... to see how your usage compares to others."

I personally wouldn't throw too big a hissy if it was incorporated into the
actual server and on by default, but I certainly know others that would.

-brian


On Wed, Nov 16, 2011 at 7:17 AM, Eric Evans eev...@acunu.com wrote:

 On Wed, Nov 16, 2011 at 2:01 AM, Jonathan Ellis jbel...@gmail.com wrote:
  On Tue, Nov 15, 2011 at 7:02 PM, Eric Evans eev...@acunu.com wrote:
  I think this is potentially quite dangerous; There are a lot of people
  who get very twitchy at the idea of software that Phones Home.  I've
  seen this so many times, and in all cases it was for software a lot
  less sensitive than a database.
 
  True, but unlike most Home Phoners, ours will be out there in the open
  and you can see exactly what it's sending (or not, if you disable it).
   I'm sure there's other examples in the wild of this, but the only one
  I can think of is popcon [1].

 I don't think the transparency of the implementation changes things
 much.  It's still going to be opaque to a lot of folks, and more
 importantly is the precedent it sets and the way it changes the
 project/user trust relationship.

 Even if you're satisfied with the implementation, and trust that it
 won't be extended to transmit additional data later (unintentionally
 or otherwise), there are still very valid privacy concerns.  For
 example, seeing as how this must be transmitted over an IP network,
 there are only so many guarantees you can make with respect to
 anonymity.  There will always be *someone* that can tie the data to a
 unique IP, and an IP can almost always be tied to an individual or
 organization.  Imagine an organization that doesn't want *anyone* to
 know it uses Cassandra, and isn't willing to accept the risk that one
 of their admins might accidentally enable this reporting.

 It's also interesting that you mention popcon because it has always
 been contentious.  It's taken years for it to transition from the
 point where it required users to install it themselves, to a prompt at
 install-time that defaulted to No, to the current state of an
 install-time prompt that defaults to Yes.  And, the installer asks
 *very* few questions; Whether or not popcon is enabled is on par with
 partitioning and the assignment of a root password.

 Also, there should be no shame in the admission that we haven't earned
 anywhere near the level of trust and respect that the Debian project
 has.

  More broadly, my sense is that people are getting used to the idea
  that it's okay to give away anonymous statistics as part of the price
  of free, although YMMclearlyV. I am, after all, a Windows user. :)

 As privacy becomes more threatened people are either capitulating, or
 becoming even more defensive; Whether that makes it better or worse
 for us if we do this is debatable.

  I'm sure you've already considered this though, you're already talking
  about anonymity, and transparency, and what I assume is neutrality of
  the collection endpoint (can apache actually provide a VM; is that a
  thing?).
 
  Yes, they provide Ubuntu or FreeBSD VMs.
 
  I'm just afraid that we'll scare people off before they can
  be properly convinced that it's all on the up-and-up.
 
  How would you propose addressing this?

 Honestly?  The best way to convince people that we take the privacy of
 their data seriously is to not transmit any of it to a machine outside
 their control.

  I'm curious to see what others think, but at the moment I'm hovering
  somewhere around a -0 if it were opt-in (off by default).
 
  I'm okay with 

AOP for SOLR Integration with Cassandra

2011-11-04 Thread Brian O'Neill
I just sent an email out over the users list.

Over a couple nights this week, I added SOLR integration into Virgil.
(Virgil is that REST layer that we've been building out over in Apache
Extras)

I just wanted to throw an idea out to the dev list...
I plan to migrate the current implementation in Virgil to use AOP.  That
will provide a good separation of concerns between Cassandra Storage and
the SOLR indexing.  Doing it with AOP will also allow us to move it into
the main codebase if/when we want to.  We would simply move the AOP to
surround CassandraServer. (or lower... even down into Storage)
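
To illustrate the separation of concerns I have in mind, here is a small
sketch of "advice" wrapped around a storage call. It is not the Virgil
code, and it uses a plain JDK dynamic proxy (java.lang.reflect.Proxy) as a
stand-in for a real AOP framework. The Storage interface, the in-memory
implementation, and the list that plays the role of the SOLR index are all
hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// JDK dynamic proxy used as a lightweight stand-in for AOP "after" advice:
// storage code knows nothing about indexing, the wrapper adds it.
class IndexingAdviceSketch {
    // Hypothetical storage interface; the real target would be CassandraServer.
    public interface Storage {
        void insert(String key, String value);
    }

    // Trivial storage implementation with no indexing logic in it.
    static class InMemoryStorage implements Storage {
        final List<String> rows = new ArrayList<>();

        @Override
        public void insert(String key, String value) {
            rows.add(key + "=" + value);
        }
    }

    // Wraps a Storage so every completed insert is also handed to the
    // "indexer" (here just a list of keys standing in for SOLR).
    static Storage withIndexing(Storage target, List<String> indexedKeys) {
        InvocationHandler handler = (proxy, method, args) -> {
            Object result = method.invoke(target, args); // proceed to the real call
            if (method.getName().equals("insert")) {
                indexedKeys.add((String) args[0]);       // after-advice: index the key
            }
            return result;
        };
        return (Storage) Proxy.newProxyInstance(
                Storage.class.getClassLoader(),
                new Class<?>[] { Storage.class },
                handler);
    }

    public static void main(String[] args) {
        InMemoryStorage storage = new InMemoryStorage();
        List<String> indexedKeys = new ArrayList<>();
        Storage advised = withIndexing(storage, indexedKeys);
        advised.insert("row1", "hello");
        System.out.println(storage.rows + " indexed=" + indexedKeys);
    }
}
```

Moving the advice lower (for example, down into storage) would then be a
matter of changing which object gets wrapped, not of touching the
indexing logic.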

Let me know if you think that is worth exploring further.

-brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Contribution: Native REST Layer for Cassandra

2011-10-18 Thread Brian O'Neill
Jeremy/Jonathan,

When you finish celebrating the 1.0 release, I just submitted a native rest
layer for Cassandra.
https://issues.apache.org/jira/browse/CASSANDRA-3380

It uses JAX-RS and Apache CXF supporting the following operations (JSON over
HTTP):


   - Create keyspace
   - Drop keyspace
   - Create column family
   - Drop column family
   - Insert row
   - Fetch row
   - Delete row
   - Insert column
   - Fetch column
   - Delete column
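
As a rough illustration of how those ten operations could map onto HTTP,
here is a small sketch. The URL layout (/keyspace/columnFamily/row/column)
and the verb choices (PUT, GET, DELETE) are assumptions made for this
example, not necessarily what the submitted module does; the README's curl
commands are the authority there. No JAX-RS is involved, just a pure
function.

```java
// Maps an HTTP method plus a path depth to one of the operations listed
// above. The path layout /keyspace/columnFamily/row/column is hypothetical.
class RestRouteSketch {
    static String operationFor(String httpMethod, String path) {
        // Strip leading/trailing slashes, then count path segments.
        String trimmed = path.replaceAll("^/+|/+$", "");
        int depth = trimmed.isEmpty() ? 0 : trimmed.split("/").length;
        if (depth < 1 || depth > 4) {
            return "unsupported";
        }
        String[] resource = { "", "keyspace", "column family", "row", "column" };
        String target = resource[depth];
        switch (httpMethod) {
            case "PUT":
                // Schema objects are created, data is inserted.
                return (depth <= 2 ? "create " : "insert ") + target;
            case "DELETE":
                // Schema objects are dropped, data is deleted.
                return (depth <= 2 ? "drop " : "delete ") + target;
            case "GET":
                // Only rows and columns are fetched in the list above.
                return depth >= 3 ? "fetch " + target : "unsupported";
            default:
                return "unsupported";
        }
    }

    public static void main(String[] args) {
        System.out.println(operationFor("PUT", "/myKeyspace"));
        System.out.println(operationFor("GET", "/myKeyspace/myCf/row1"));
        System.out.println(operationFor("DELETE", "/myKeyspace/myCf/row1/col1"));
    }
}
```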

This is a new module under contrib/rest. It builds using ant and ivy.  I
also included a maven pom.xml file that makes it easier to get set up in
Eclipse for those that use m2eclipse.  You start the server with
bin/rest_cassandra.  After that, you can issue all commands over HTTP on
port 8080.  I included example curl commands in the README.txt.  There are
junit tests that provide good code coverage of the JSON marshalling, the
system and data operations as well as the REST layer.

Let me know if you have any trouble building / using it.  In the meantime,
I'll start work on some additional todo's. Specifically we should add:
- Better exception handling
- Host/Port configuration
- Security
- XML support
- Binary object / Byte support (assumes Strings right now)

(kudos to Gary Dusbabek for the initial thought to implement this as a
native layer)

all the best,
brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Eclipse style/formatting file?

2011-10-13 Thread Brian O'Neill
All,

Anyone have an eclipse style/formatting file compatible with the Cassandra
code?

I don't see one here:
http://wiki.apache.org/cassandra/RunningCassandraInEclipse

(I'm trying to get the REST API in a good state for contribution)

thanks,
brian

-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: Eclipse style/formatting file?

2011-10-13 Thread Brian O'Neill
Perfect. Thanks.

-brian

On Thu, Oct 13, 2011 at 11:13 AM, Feiyi Wang fwa...@gmail.com wrote:

 How about this?
 https://github.com/tjake/cassandra-style-eclipse

 Feiyi


 On Thu, Oct 13, 2011 at 10:58 AM, Brian O'Neill b...@alumni.brown.edu
 wrote:

  All,
 
  Anyone have an eclipse style/formatting file compatible with the
 Cassandra
  code?
 
  I don't see one here:
  http://wiki.apache.org/cassandra/RunningCassandraInEclipse
 
  (I'm trying to get the REST API in a good state for contribution)
 
  thanks,
  brian
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/
 




-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: REST API?

2011-10-11 Thread Brian O'Neill
To give everyone an update...

I was able to take what Gary had and update it to run on trunk.
I like the native integration, as opposed to layering it on top of Hector.
 It's working out well.
I layered in JAX-RS to replace the hand parsing of the url, and the
handlers.
I have reads and writes working through the StorageProxy, but I think I'm
going to raise it up one layer to take advantage of ThriftValidations.
(but still using direct method invocation instead of the thrift client)
I added unit tests for the read/write of columns.

I'm going to add a few other operations (add/drop keyspace, add/drop CF).
Then it should be in a state where I can share it.

-brian



On Mon, Oct 10, 2011 at 10:06 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:

 Brian,

 If you end up doing something with the rest api and making it
 available/open source, please post again either here or on the user list.  I
 think others would be interested and may contribute to it.

 Cheers,

 Jeremy

 On Oct 10, 2011, at 8:42 PM, Brian O'Neill wrote:

  Thanks Gary. Perfect.  Checking it out now.
 
  Performance isn't much of a concern for us through the REST interface.
  We
  are using the Hadoop/PIG integration to do the heavy lifting.  This will
 be
  mostly for reads and small number of writes.
 
  I'll definitely give this a try.  Thanks again.  I'll let you know how it
  turns out.
 
  -brian
 
  On Mon, Oct 10, 2011 at 9:35 PM, Gary Dusbabek gdusba...@gmail.com
 wrote:
 
  It turns out that it is pretty easy (or it was a year ago) to replace
  the native Cassandra transport with your own.  I wrote about it on my
  blog (http://www.onemanclapping.org/2010/09/restful-cassandra.html),
  using REST as an example.
 
 
  On Mon, Oct 10, 2011 at 20:12, Brian O'Neill b...@alumni.brown.edu
  wrote:
  My team desperately needs a REST API for Cassandra.
 
  I saw the following:
  http://code.google.com/p/restish/
  from
 
 
 http://crlog.info/2011/01/29/restish-wrapper-for-hectorcassandra-data-manipulation/
 
  But it appears to have little activity and documentation.
 
  That led me to start work on a contrib/rest module, but before I get too far
  I wanted to ask if there was any effort underway for a REST Server/API.
  If not, I'll continue developing the REST server.  Any preference for a
  REST
  stack?  (JAX-RS on Apache-CXF?  Raw Servlets? Netty? etc.)
 
  Until I hear back, I'll continue with the JAX-RS / Apache CXF
  implementation
  I have cooking.
 
  -brian
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/
 
 
 
 
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/




-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Patch for Contrib/Pig to accommodate refactoring of hexToBytes

2011-10-10 Thread Brian O'Neill
Jonathan,

We need a small update to contrib/pig to accommodate pulling hexToBytes out
of FBUtilities into Hex.
I raised an issue, and attached is the patch for trunk.

https://issues.apache.org/jira/browse/CASSANDRA-3341

-brian


-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/
Index: src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java
===================================================================
--- src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java	(revision 1181048)
+++ src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java	(working copy)
@@ -26,7 +26,7 @@
 import org.apache.cassandra.db.marshal.IntegerType;
 import org.apache.cassandra.db.marshal.TypeParser;
 import org.apache.cassandra.thrift.*;
-import org.apache.cassandra.utils.FBUtilities;
+import org.apache.cassandra.utils.Hex;
 import org.apache.commons.logging.Log;
 import org.apache.commons.logging.LogFactory;
 
@@ -601,7 +601,7 @@
         TSerializer serializer = new TSerializer(new TBinaryProtocol.Factory());
         try
         {
-            return FBUtilities.bytesToHex(serializer.serialize(cfDef));
+            return Hex.bytesToHex(serializer.serialize(cfDef));
         }
         catch (TException e)
         {
@@ -616,7 +616,7 @@
         CfDef cfDef = new CfDef();
         try
         {
-            deserializer.deserialize(cfDef, FBUtilities.hexToBytes(st));
+            deserializer.deserialize(cfDef, Hex.hexToBytes(st));
         }
         catch (TException e)
         {
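
For readers who have not looked at these helpers, a minimal sketch of what
a bytesToHex / hexToBytes pair does. This is not the code in
org.apache.cassandra.utils.Hex; HexSketch is a hypothetical stand-in that
only shows the round trip the patched call sites rely on (serialize a
CfDef to bytes, carry it around as a hex string, decode it back).

```java
import java.util.Arrays;

// Illustrative hex codec; the real implementation lives in Cassandra's
// utils package and may differ in details such as error handling.
class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Encodes each byte as two lowercase hex characters.
    static String bytesToHex(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xFF;          // treat the byte as unsigned
            out[2 * i] = DIGITS[v >>> 4];     // high nibble
            out[2 * i + 1] = DIGITS[v & 0x0F]; // low nibble
        }
        return new String(out);
    }

    // Decodes a hex string (upper or lower case) back into bytes.
    static byte[] hexToBytes(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("non-hex character in input");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] original = { 0x00, (byte) 0xCA, (byte) 0xFE };
        String hex = bytesToHex(original);
        System.out.println(hex);
        System.out.println(Arrays.equals(original, hexToBytes(hex)));
    }
}
```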


Re: REST API?

2011-10-10 Thread Brian O'Neill
Thanks Gary. Perfect.  Checking it out now.

Performance isn't much of a concern for us through the REST interface.  We
are using the Hadoop/PIG integration to do the heavy lifting.  This will be
mostly for reads and small number of writes.

I'll definitely give this a try.  Thanks again.  I'll let you know how it
turns out.

-brian

On Mon, Oct 10, 2011 at 9:35 PM, Gary Dusbabek gdusba...@gmail.com wrote:

 It turns out that it is pretty easy (or it was a year ago) to replace
 the native Cassandra transport with your own.  I wrote about it on my
 blog (http://www.onemanclapping.org/2010/09/restful-cassandra.html),
 using REST as an example.


 On Mon, Oct 10, 2011 at 20:12, Brian O'Neill b...@alumni.brown.edu
 wrote:
  My team desperately needs a REST API for Cassandra.
 
  I saw the following:
  http://code.google.com/p/restish/
  from
 
 http://crlog.info/2011/01/29/restish-wrapper-for-hectorcassandra-data-manipulation/
 
  But it appears to have little activity and documentation.
 
  That led me to start work on a contrib/rest module, but before I get too far
  I wanted to ask if there was any effort underway for a REST Server/API.
  If not, I'll continue developing the REST server.  Any preference for a
 REST
  stack?  (JAX-RS on Apache-CXF?  Raw Servlets? Netty? etc.)
 
  Until I hear back, I'll continue with the JAX-RS / Apache CXF
 implementation
  I have cooking.
 
  -brian
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/
 




-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/


Re: REST API?

2011-10-10 Thread Brian O'Neill
Will do.  I've picked up where Gary left off.  It is a good starting point,
with a good mapping between REST and get/set/mutations. (kudos to Gary)
I'll update it to accommodate any changes and see if I can add some tests on
top of it.

I may look to add in JAX-RS (on either Jersey or Apache CXF).  We use it for
all of our REST services, and it may provide a good abstraction layer that
we can build on.

Give me a couple days.
I have to get back into the ant mentality. I've been doing maven too long.
BTW -- Does anyone know if there are plans to move to maven?
(Not trying to start a religious war, just curious. ;)

-brian

On Mon, Oct 10, 2011 at 10:06 PM, Jeremy Hanna jeremy.hanna1...@gmail.com wrote:

 Brian,

 If you end up doing something with the rest api and making it
 available/open source, please post again either here or on the user list.  I
 think others would be interested and may contribute to it.

 Cheers,

 Jeremy

 On Oct 10, 2011, at 8:42 PM, Brian O'Neill wrote:

  Thanks Gary. Perfect.  Checking it out now.
 
  Performance isn't much of a concern for us through the REST interface.
  We
  are using the Hadoop/PIG integration to do the heavy lifting.  This will
 be
  mostly for reads and small number of writes.
 
  I'll definitely give this a try.  Thanks again.  I'll let you know how it
  turns out.
 
  -brian
 
  On Mon, Oct 10, 2011 at 9:35 PM, Gary Dusbabek gdusba...@gmail.com
 wrote:
 
  It turns out that it is pretty easy (or it was a year ago) to replace
  the native Cassandra transport with your own.  I wrote about it on my
  blog (http://www.onemanclapping.org/2010/09/restful-cassandra.html),
  using REST as an example.
 
 
  On Mon, Oct 10, 2011 at 20:12, Brian O'Neill b...@alumni.brown.edu
  wrote:
  My team desperately needs a REST API for Cassandra.
 
  I saw the following:
  http://code.google.com/p/restish/
  from
 
 
 http://crlog.info/2011/01/29/restish-wrapper-for-hectorcassandra-data-manipulation/
 
  But it appears to have little activity and documentation.
 
  That led me to start work on a contrib/rest module, but before I get too far
  I wanted to ask if there was any effort underway for a REST Server/API.
  If not, I'll continue developing the REST server.  Any preference for a
  REST
  stack?  (JAX-RS on Apache-CXF?  Raw Servlets? Netty? etc.)
 
  Until I hear back, I'll continue with the JAX-RS / Apache CXF
  implementation
  I have cooking.
 
  -brian
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/
 
 
 
 
 
  --
  Brian ONeill
  Lead Architect, Health Market Science (http://healthmarketscience.com)
  mobile:215.588.6024
  blog: http://weblogs.java.net/blog/boneill42/
  blog: http://brianoneill.blogspot.com/




-- 
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog: http://weblogs.java.net/blog/boneill42/
blog: http://brianoneill.blogspot.com/