Re: are there any free Cassandra -> ElasticSearch connector / plugin ?

2016-10-13 Thread Brian O'Neill
I haven't used it yet, but https://github.com/vroyer/elassandra -- Brian O'Neill Principal Architect @ Monetate m: 215.588.6024 bone...@monetate.com > On Oct 13, 2016, at 6:02 PM, Eric Ho <e...@anal

Re: Support for ad-hoc query

2015-06-09 Thread Brian O'Neill
Cassandra isn't great at ad hoc queries. Many of us have paired it with an indexing engine like SOLR or Elastic Search. (built into the DSE solution) As of late, I think there are a few of us exploring Spark SQL. (which you can then use via JDBC or REST) -brian --- Brian O'Neill Chief

Re: Spark SQL JDBC Server + DSE

2015-06-03 Thread Brian O'Neill
Kudos Ben. We've been tracking Zeppelin, and considered doing the same thing. You beat us to it. Well done. -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42 This information

Re: Spark SQL JDBC Server + DSE

2015-05-30 Thread Brian O'Neill
-the- thrift-jdbcodbc-server I wouldn't mind collaborating on that, if you are headed in that direction. (and then I could write the REST server on top of that) LMK, -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http

Re: Spark SQL JDBC Server + DSE

2015-05-28 Thread Brian O'Neill
assume you need JDBC connectivity specifically? -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42 This information transmitted in this email message is for the intended recipient only

Re: cassandra and spark from cloudera distirbution

2015-04-22 Thread Brian O'Neill
Depends on which version of Spark you are running on Cloudera. Once you know that -- have a look at the compatibility chart here: https://github.com/datastax/spark-cassandra-connector -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile

Re: Adhoc querying in Cassandra?

2015-04-22 Thread Brian O'Neill
+1, I think many organizations (including ours) pair Elastic Search with Cassandra. Use Cassandra as your system of record, then index the data with ES. -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http

Re: Adhoc querying in Cassandra?

2015-04-22 Thread Brian O'Neill
Again -- agreed. They have different usage patterns (C* heavy writes, ES heavy reads), so I would separate them. SOLR should be sufficient. I believe DSE is a tight integration between SOLR and C*. -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company

Re: Cassandra - Storm

2015-04-03 Thread Brian O'Neill
I'd recommend using Storm's State abstraction. Check out: https://github.com/hmsonline/storm-cassandra-cql -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http://www.twitter.com/boneill42 This information

Re: Frequent timeout issues

2015-04-01 Thread Brian O'Neill
Are you using the storm-cassandra-cql driver? (https://github.com/hmsonline/storm-cassandra-cql) If so, what version? Batching or no batching? -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http://www.twitter.com

Re: cassandra source code

2015-03-24 Thread Brian O'Neill
FWIW -- I just went through this, and posted the process I used to get up and running: http://brianoneill.blogspot.com/2015/03/getting-started-with-cassandra.html -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 http

Re: IF NOT EXISTS on UPDATE statements?

2014-11-18 Thread Brian O'Neill
would love to see: UPSERT value=new_value where (not exists || value=read_value) (ignoring some intricacies) -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http

Re: IF NOT EXISTS on UPDATE statements?

2014-11-18 Thread Brian O'Neill
Exactly. Perfect. Will do. Thanks Robert. -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com

Re: [ANN] SparkSQL support for Cassandra with Calliope

2014-10-03 Thread Brian O'Neill
Well done Rohit. (and crew) -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com This information

Re: Cassandra blob storage

2014-03-18 Thread Brian O'Neill
You may want to look at: https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Brian O'Neill
queries using more familiar syntax. (including future things such as joins, grouping, etc.) To me, that is exciting, and again -- one of the reasons we are leaning on it. my two cents, brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700

[Blog] : Storm and Cassandra : A Three Year Retrospective

2014-02-13 Thread Brian O'Neill
A community member asked for a blog post on Storm + Cassandra. FWIW, here was our journey. http://brianoneill.blogspot.com/2014/02/storm-and-cassandra-three-year.html -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive

Re: CQL list command

2014-02-07 Thread Brian O'Neill
+1, agreed. I do the same thing. If cli is going away, we'll need this ability in cqlsh. I created a JIRA issue for it: https://issues.apache.org/jira/browse/CASSANDRA-6676 We'll see what the crew come back with. -brian --- Brian O'Neill Chief Technology Officer Health Market Science

Re: Dimensional SUM, COUNT, DISTINCT in C* (replacing Acunu)

2013-12-18 Thread Brian O'Neill
'll continue the discussion on the issue. thanks again, brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com

Dimensional SUM, COUNT, DISTINCT in C* (replacing Acunu)

2013-12-17 Thread Brian O'Neill
We are seeking to replace Acunu in our technology stack / platform. It is the only component in our stack that is not open source. In preparation, over the last few weeks I’ve migrated Virgil to CQL. The vision is that Virgil could receive a REST request to upsert/delete data (hierarchical

Re: Drop keyspace via CQL hanging on master/trunk.

2013-12-10 Thread Brian O'Neill
Great. Thanks Aaron. FWIW, I am/was porting Virgil over CQL. I should be able to release a new REST API for C* (using CQL) shortly. -brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024

Drop keyspace via CQL hanging on master/trunk.

2013-12-05 Thread Brian O'Neill
describe keyspaces; system test_keyspace system_traces cqlsh drop keyspace test_keyspace; THIS HANGS INDEFINITELY thoughts? user error? worth filing an issue? One other note -- this happens using the CQL java driver as well. -brian --- Brian O'Neill Chief Architect Health Market Science

Re: Drop keyspace via CQL hanging on master/trunk.

2013-12-05 Thread Brian O'Neill
I removed the data directory just to make sure I had a clean environment. (eliminating the possibility of corrupt keyspaces/files causing problems) -brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M

Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Brian O'Neill
Vivek, The location of CassandraDaemon changed between versions. (from org.apache.cassandra.thrift to org.apache.cassandra.service) It is likely that the start scripts are picking up the old version on the classpath, which results in the main method not being found. Do you have CASSANDRA_HOME

Re: Main method not found in class org.apache.cassandra.service.CassandraDaemon

2013-07-17 Thread Brian O'Neill
Vivek, You could try echoing the CLASSPATH to double check. Drop an echo into the launch_service function in the cassandra shell script. (~line 121) Let us know the output. -brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King

SQL Injection C* (via CQL Thrift)

2013-06-18 Thread Brian O'Neill
Mostly for fun, I wanted to throw this out there... We are undergoing a security audit for our platform (C* + Elastic Search + Storm). One component of that audit is susceptibility to SQL injection. I was wondering if anyone has attempted to construct a SQL injection attack against Cassandra?

Re: SQL Injection C* (via CQL Thrift)

2013-06-18 Thread Brian O'Neill
% confident making that assertion. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com

[BLOG] : Cassandra as a Deep Storage Mechanism for Druid Real-Time Analytics Engine

2013-05-17 Thread Brian O'Neill
FWIW, we were able to integrate Druid and Cassandra. It's only a PoC right now, but it seems like a powerful combination: http://brianoneill.blogspot.com/2013/05/cassandra-as-deep-storage-mechanism-for.html -brian -- Brian ONeill Lead Architect, Health Market Science

Re: multitenant support with key spaces

2013-05-06 Thread Brian O'Neill
You may want to look at using virtual keyspaces: http://hector-client.github.io/hector/build/html/content/virtual_keyspaces.html And follow these tickets: http://wiki.apache.org/cassandra/MultiTenant -brian On May 6, 2013, at 2:37 AM, Darren Smythe wrote: How many keyspaces can you

Re: Exporting all data within a keyspace

2013-04-30 Thread Brian O'Neill
You could always do something like this as well: http://brianoneill.blogspot.com/2012/05/dumping-data-from-cassandra-like.html -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M

Re: Blobs in CQL?

2013-04-11 Thread Brian O'Neill
Great! Thanks Gabriel. Do you have an example? (are you using QueryBuilder?) I couldn't find the part of the API that allowed you to pass in the byte array. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King

Re: Blobs in CQL?

2013-04-11 Thread Brian O'Neill
); sb.append(ByteBufferUtil.bytesToHex((ByteBuffer)value)); } Hopefully, the prepared statement doesn't do the conversion. (I'm not sure if it is a limitation of the CQL protocol itself) thanks again, -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science
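For readers following the thread: the fragment above appends a hex rendering of a blob value into a query string. The sketch below shows that kind of hex conversion using only the JDK. The class and method names (`BlobHex`, `toHex`) are hypothetical, and this is not Cassandra's `ByteBufferUtil.bytesToHex` implementation, just an illustration of the idea under that assumption.

```java
import java.nio.ByteBuffer;

public class BlobHex {
    /**
     * Hypothetical helper: renders the readable bytes of a ByteBuffer as
     * lowercase hex. Works on a duplicate so the caller's position is untouched.
     */
    public static String toHex(ByteBuffer value) {
        ByteBuffer bb = value.duplicate();
        StringBuilder sb = new StringBuilder(bb.remaining() * 2);
        while (bb.hasRemaining()) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", bb.get() & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ByteBuffer blob = ByteBuffer.wrap(new byte[] {0, 15, (byte) 0xca, (byte) 0xfe});
        System.out.println(toHex(blob));
    }
}
```

As the thread goes on to note, binding the value through a prepared statement avoided the hex conversion altogether.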

Re: Blobs in CQL?

2013-04-11 Thread Brian O'Neill
(); LOG.error("repository.get() [" + key + "] byte.length()=[" + data.length + "]"); return data; } --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http

Re: Blobs in CQL?

2013-04-11 Thread Brian O'Neill
Sylvain, Interesting, when I look at the actual bytes returned, I see the byte array is prefixed with the keyspace and table name. I assume I'm doing something wrong in the select. Am I incorrectly using the ResultSet? -brian On Thu, Apr 11, 2013 at 9:09 AM, Brian O'Neill b

Re: Blobs in CQL?

2013-04-11 Thread Brian O'Neill
the bb.remaining() bytes starting at bb.arrayOffset() + bb.position() (where bb is the returned ByteBuffer). -- Sylvain -brian On Thu, Apr 11, 2013 at 9:09 AM, Brian O'Neill b...@alumni.brown.edu wrote: Yep, it worked like a charm. (PreparedStatement avoided the hex conversion
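Sylvain's point about reading only the bb.remaining() bytes can be sketched with the plain JDK. Calling array() on a ByteBuffer returns the whole backing array, which may hold unrelated bytes before the value; copying from the current position avoids that. The class name `BlobBytes` and the sample bytes are made up for illustration, and no driver API is assumed here.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class BlobBytes {
    /**
     * Copies only the readable region of bb: remaining() bytes starting at its
     * current position. duplicate() shares the content but keeps an independent
     * position, so the caller's buffer is left untouched.
     */
    public static byte[] toBytes(ByteBuffer bb) {
        byte[] out = new byte[bb.remaining()];
        bb.duplicate().get(out);
        return out;
    }

    public static void main(String[] args) {
        // Simulate a value that sits inside a larger backing array.
        byte[] frame = {'k', 's', 1, 2, 3};
        ByteBuffer bb = ByteBuffer.wrap(frame, 2, 3); // position=2, limit=5

        System.out.println(Arrays.toString(bb.array()));  // whole backing array, prefix included
        System.out.println(Arrays.toString(toBytes(bb))); // only the value bytes
    }
}
```

This would explain a byte array that appears to be prefixed with other data: the prefix belongs to the backing array, not to the value.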

Re: Bitmap indexes - reviving CASSANDRA-1472

2013-04-10 Thread Brian O'Neill
changing to user@ (at least until we can determine if this can/should be proposed under 1472) For those interested in analytics and set-based queries, see below... -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon

BI/Analtyics/Warehousing for data in C*

2013-04-01 Thread Brian O'Neill
We are trudging through an options analysis for BI/DW solutions for data stored in C*. I'd love to hear people's experiences. Here is what we've found so far: http://brianoneill.blogspot.com/2013/04/bianalytics-on-big-datacassandra.html Maybe we just use Intravert with a custom handler to

Re: any other NYC* attendees find your usb stick of the proceedings empty?

2013-03-25 Thread Brian O'Neill
/edwardcapriolo/intravert -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com This information transmitted

Re: Netflix/Astynax Client for Cassandra

2013-02-07 Thread Brian O'Neill
Incidentally, we run Astyanax against 1.2.1. We haven't had any issues. When running against 1.2.0, we ran into this: https://github.com/Netflix/astyanax/issues/191 -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon

Re: Accessing Metadata of Column Familes

2013-01-28 Thread Brian O'Neill
Through CQL, you see the logical schema. Through CLI, you see the physical schema. This may help: http://www.datastax.com/dev/blog/cql3-for-cassandra-experts -brian On Mon, Jan 28, 2013 at 7:26 AM, Rishabh Agrawal rishabh.agra...@impetus.co.in wrote: I found following issues while working on

Re: cql: show tables in a keystone

2013-01-28 Thread Brian O'Neill
cqlsh use keyspace; cqlsh:cirrus describe tables; For more info: cqlsh help describe -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http

Webinar: Using Storm for Distributed Processing on Cassandra

2013-01-16 Thread Brian O'Neill
Just an FYI -- We will be hosting a webinar tomorrow demonstrating the use of Storm as a distributed processing layer on top of Cassandra. I'll be tag teaming with Taylor Goetz, the original author of storm-cassandra. http://www.datastax.com/resources/webinars/collegecredit It is part of the

Re: Cassandra 1.2 Thrift and CQL 3 issue

2013-01-12 Thread Brian O'Neill
I reported the issue here. You may be missing a component in your column name. https://issues.apache.org/jira/browse/CASSANDRA-5138 -brian On Jan 12, 2013, at 12:48 PM, Shahryar Sedghi wrote: Hi I am trying to test my application that runs with JDBC, CQL 3 with Cassandra 1.2. After

Re: Astyanax

2013-01-08 Thread Brian O'Neill
-frist-java-application -w.html -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com

Re: Best Java Driver for Cassandra?

2012-12-13 Thread Brian O'Neill
available afterwards. I also have a laundry list here: (written before I knew about Firebrand) http://brianoneill.blogspot.com/2012/08/cassandra-apis-laundry-list.html -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon

Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra

2012-12-12 Thread Brian O'Neill
FWIW -- I'm presenting tomorrow for the Datastax C*ollege Credit Webinar Series: http://brianoneill.blogspot.com/2012/12/presenting-for-datastax-college-credit.html I hope to make CQL part of the presentation and show how it integrates with the Java APIs. If you are interested, drop in. -brian

Re: Datatype Conversion in CQL-Client?

2012-11-19 Thread Brian O'Neill
I don't think Michael and/or Jonathan have published the CQL java driver yet. (CCing them) Hopefully they'll find a public home for it soon, I hope to include it in the Webinar in December. (http://www.datastax.com/resources/webinars/collegecredit) -brian --- Brian O'Neill Lead Architect

Re: Datatype Conversion in CQL-Client?

2012-11-19 Thread Brian O'Neill
that metadata to the result set in: https://github.com/apache/cassandra/blob/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/CqlResult.java -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King

Re: Datastax Java Driver

2012-11-19 Thread Brian O'Neill
Woohoo! Thanks for making this available. --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com

Re: Datatype Conversion in CQL-Client?

2012-11-19 Thread Brian O'Neill
-brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com This information transmitted in this email

Re: Datatype Conversion in CQL-Client?

2012-11-18 Thread Brian O'Neill
If you are talking about the CQL-client that comes with Cassandra (cqlsh), it is actually written in Python: https://github.com/apache/cassandra/blob/trunk/bin/cqlsh For information on datatypes (and conversion) take a look at the CQL definition:

Re: [BETA RELEASE] Apache Cassandra 1.2.0-beta2 released

2012-11-10 Thread Brian O'Neill
Wow...good catch. We had puppet scripts which automatically assigned the proper tokens given the cluster size. What is the range now? Got a link? -brian On Nov 10, 2012, at 9:27 PM, Edward Capriolo wrote: just a note for all. The default partitioner is no longer randompartitioner. It is

Indexing Data in Cassandra with Elastic Search

2012-11-08 Thread Brian O'Neill
For those looking to index data in Cassandra with Elastic Search, here is what we decided to do: http://brianoneill.blogspot.com/2012/11/big-data-quadfecta-cassandra-storm.html -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024

Re: logging servers? any interesting in one for cassandra?

2012-11-07 Thread Brian O'Neill
Thanks Dean. We'll definitely take a look. (probably in January) -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42

Re: logging servers? any interesting in one for cassandra?

2012-11-06 Thread Brian O'Neill
the next few months. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com This information transmitted

Keeping the record straight for Cassandra Benchmarks...

2012-10-25 Thread Brian O'Neill
People probably saw... http://www.networkworld.com/cgi-bin/mailto/x.cgi?pagetosend=/news/tech/2012/102212-nosql-263595.html To clarify things take a look at... http://brianoneill.blogspot.com/2012/10/solid-nosql-benchmarks-from-ycsb-w-side.html -brian -- Brian ONeill Lead Architect, Health

Re: Using compound primary key

2012-10-08 Thread Brian O'Neill
Hey Vivek, The same thing happened to me the other day. You may be missing a component in your compound key. See this thread: http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3ccajhhpg20rrcajqjdnf8sf7wnhblo6j+aofksgbxyxwcoocg...@mail.gmail.com%3E I also wrote a couple blogs

Re: 1000's of column families

2012-10-02 Thread Brian O'Neill
Without putting too much thought into it... Given the underlying architecture, I think you could/would have to write your own partitioner, which would partition based on the prefix/virtual keyspace. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science

Re: 1000's of CF's. virtual CFs do NOT work…..map/reduce

2012-10-02 Thread Brian O'Neill
Dean, Great point. I hadn't considered that either. Per my other email, think we would need a custom partitioner for this? (a mix of OrderPreservingPartitioner and RandomPartitioner, OPP for the prefix) -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science

Re: 1000's of column families

2012-10-02 Thread Brian O'Neill
Agreed. Do we know yet what the overhead is for each column family? What is the limit? If you have a SINGLE keyspace w/ 2+ CF's, what happens? Anyone know? -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon

Re: 1000's of CF's. virtual CFs possible Map/Reduce SOLUTION...

2012-10-02 Thread Brian O'Neill
using Storm, let me know. We have an unreleased version of the bolt that you probably want to use. (we're waiting on Nathan/Storm to fix some classpath loading issues) RE: a custom virtual keyspace Partitioner, point well taken -brian --- Brian O'Neill Lead Architect, Software Development

Re: 1000's of column families

2012-10-02 Thread Brian O'Neill
Exactly. --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com This information transmitted

Re: 1000's of column families

2012-10-01 Thread Brian O'Neill
Dean, We have the same question... We have thousands of separate feeds of data as well (20,000+). To date, we've been using a CF per feed strategy, but as we scale this thing out to accommodate all of those feeds, we're trying to figure out if we're going to blow out the memory. The initial

Re: 1000's of column families

2012-10-01 Thread Brian O'Neill
It's just a convenient way of prefixing: http://hector-client.github.com/hector/build/html/content/virtual_keyspaces.html -brian On Mon, Oct 1, 2012 at 4:22 PM, Ben Hood 0x6e6...@gmail.com wrote: Brian, On Mon, Oct 1, 2012 at 4:22 PM, Brian O'Neill b...@alumni.brown.edu wrote: We haven't
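To make the prefixing idea concrete, here is a rough sketch of how a per-tenant prefix could be added to and stripped from row keys so that many logical keyspaces share one physical column family. This is not Hector's virtual keyspace API; the class and method names (`PrefixedKeys`, `withPrefix`, `stripPrefix`) and the byte layout are assumptions made purely for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class PrefixedKeys {
    /** Hypothetical helper: returns prefix bytes followed by the original row key. */
    public static byte[] withPrefix(String prefix, byte[] rowKey) {
        byte[] p = prefix.getBytes(StandardCharsets.UTF_8);
        byte[] out = Arrays.copyOf(p, p.length + rowKey.length);
        System.arraycopy(rowKey, 0, out, p.length, rowKey.length);
        return out;
    }

    /** Hypothetical helper: removes a known prefix from a stored key. */
    public static byte[] stripPrefix(String prefix, byte[] storedKey) {
        int n = prefix.getBytes(StandardCharsets.UTF_8).length;
        return Arrays.copyOfRange(storedKey, n, storedKey.length);
    }

    public static void main(String[] args) {
        byte[] rowKey = "row-42".getBytes(StandardCharsets.UTF_8);
        byte[] stored = withPrefix("feedA:", rowKey);
        System.out.println(new String(stored, StandardCharsets.UTF_8));
        System.out.println(new String(stripPrefix("feedA:", stored), StandardCharsets.UTF_8));
    }
}
```

A fixed, unambiguous prefix format matters here: if one prefix can be the start of another, keys from different virtual keyspaces could collide.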

Re: Using the commit log for external synchronization

2012-09-21 Thread Brian O'Neill
@aaronmorton http://www.thelastpickle.com On 21/09/2012, at 11:51 AM, Brian O'Neill b...@alumni.brown.edu wrote: Along those lines... We sought to use triggers for external synchronization. If you read through this issue: https://issues.apache.org/jira/browse/CASSANDRA-1311 You'll see

Re: Kundera 2.1 released

2012-09-21 Thread Brian O'Neill
Well done, Vivek and team!! This release was much anticipated. I'll give this a test with Spring Data JPA when I return from vacation. thanks, -brian On Sep 21, 2012, at 9:15 PM, Vivek Mishra wrote: Hi All, We are happy to announce release of Kundera 2.0.7. Kundera is a JPA 2.0

Re: Using the commit log for external synchronization

2012-09-20 Thread Brian O'Neill
Along those lines... We sought to use triggers for external synchronization. If you read through this issue: https://issues.apache.org/jira/browse/CASSANDRA-1311 You'll see the idea of leveraging a commit log for synchronization, via triggers. We went ahead and implemented this concept in:

Re: Data Modeling - JSON vs Composite columns

2012-09-19 Thread Brian O'Neill
Roshni, We're going through the same debate right now. I believe native support for JSON (or collections) is on the docket for Cassandra. Here is a discussion we had a few months ago on the topic: http://comments.gmane.org/gmane.comp.db.cassandra.devel/5233 We presently store JSON, but we're

Re: Solr Use Cases

2012-09-19 Thread Brian O'Neill
Roshni, We're using SOLR to support ad hoc queries and fuzzy searches against unstructured data stored in Cassandra. Cassandra is great for storage and you can create data models and indexes that support your queries, provided you can anticipate those queries. When you can't anticipate the

Compound Keys: Connecting the dots between CQL3 and Java APIs

2012-09-11 Thread Brian O'Neill
Our data architects (ex-Oracle DBA types) are jumping on the CQL3 bandwagon and creating schemas for us. That triggered me to write a quick article mapping the CQL3 schemas to how they are accessed via Java APIs (for our dev team). I hope others find this useful as well:

Re: Cassandra API Library.

2012-09-04 Thread Brian O'Neill
You got it. (done) -brian On Tue, Sep 4, 2012 at 7:08 AM, Filipe Gonçalves the.wa.syndr...@gmail.com wrote: @Brian: you can add the Cassandra::Simple Perl client http://fmgoncalves.github.com/p5-cassandra-simple/ 2012/8/27 Paolo Bernardi berna...@gmail.com On 08/23/2012 01:40 PM, Thomas

Re: Spring - cassandra

2012-08-30 Thread Brian O'Neill
to email me directly so we don't spam this list. (or set up a googlegroup just in case others want to contribute) -brian --- Brian O'Neill Lead Architect, Software Development Apache Cassandra MVP Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M

Re: Spring - cassandra

2012-08-29 Thread Brian O'Neill
You looking for the author of Spring Data Cassandra? https://github.com/boneill42/spring-data-cassandra If so, I guess that is me. =) -brian --- Brian O'Neill Lead Architect, Software Development Apache Cassandra MVP Health Market Science The Science of Better Results 2700 Horizon Drive

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
We've used 'em all and… (IMHO) 1) I would avoid Thrift directly. 2) Hector is a sure bet. 3) Astyanax is the up and comer. 4) Kundera is good, but works like an ORM -- so not so good if your columns aren't defined ahead of time. -brian --- Brian O'Neill Lead Architect, Software Development

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
Thanks Dean… I hadn't played with that one. I wonder if that would better fit the bill for the Spring Data Cassandra module I'm hacking on. https://github.com/boneill42/spring-data-cassandra I'll poke around. -brian --- Brian O'Neill Lead Architect, Software Development Health Market

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
FWIW.. I just threw this together... http://brianoneill.blogspot.com/2012/08/cassandra-apis-laundry-list.html Let me know if I missed any others. (I didn't have playorm on there) -brian On Thu, Aug 23, 2012 at 9:51 AM, Brian O'Neill boneil...@gmail.com wrote: Thanks Dean… I hadn't played

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
Ha… how could I forget? =) Adding it now. --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 http://www.twitter.com/boneill42 • healthmarketscience.com

A Big Data Trifecta: Storm, Kafka and Cassandra

2012-08-04 Thread Brian O'Neill
Philip, I figured I would reply via blog post. =) http://brianoneill.blogspot.com/2012/08/a-big-data-trifecta-storm-kafka-and.html That blog post shows how we pieced together Kafka and Cassandra (via Storm). With LinkedIn behind Kafka, it is well supported. They use it in production. (and most

Re: How to process new rows in parallel?

2012-08-03 Thread Brian O'Neill
If you are deleting the messages after processing, it sounds like you are using Cassandra as a work queue. Here are some links for implementing a distributed queue in Cassandra: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Distributed-work-queues-td5226248.html

Re: How to manually build and maintain secondary indexes

2012-07-26 Thread Brian O'Neill
. It doesn't address all of your concerns, but I tried to capture the motivation behind our implementation here: http://brianoneill.blogspot.com/2012/03/cassandra-indexing-good-bad-and-ugly.html -brian -- Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Drive | King

An experiment using Spring Data w/ Cassandra (initially via JPA/Kundera)

2012-07-18 Thread Brian O'Neill
This is just an FYI. I experimented w/ Spring Data JPA w/ Cassandra leveraging Kundera. It sort of worked: https://github.com/boneill42/spring-data-jpa-cassandra http://brianoneill.blogspot.com/2012/07/spring-data-w-cassandra-using-jpa.html I'm now working on a pure Spring Data adapter using

Re: Trigger and customized filter

2012-07-10 Thread Brian O'Neill
While Jonathan and crew work on the infrastructure to support triggers: https://issues.apache.org/jira/browse/CASSANDRA-4285 We have a project going over here that provides a trigger-like capability: https://github.com/hmsonline/cassandra-triggers/

Re: Cassandra and Tableau

2012-07-06 Thread Brian O'Neill
Robin, We have the same issue right now. We use Tableau for all of our reporting needs, but we couldn't find any acceptable bridge between it and Cassandra. We ended up using cassandra-triggers to replicate the data to Oracle. https://github.com/hmsonline/cassandra-triggers/ Let us know if you

Re: which high level Java client

2012-06-28 Thread Brian O'Neill
FWIW, We keep most of our system level integrations behind REST using Virgil: https://github.com/hmsonline/virgil When a lower-level integration is necessary we use Hector, but recently we've started using Astyanax and plan to port our Hector dependencies over to Astyanax when given a chance.

Re: Ball is rolling on High Performance Cassandra Cookbook second edition

2012-06-27 Thread Brian O'Neill
RE: API method signatures changing That triggers another thought... What terminology will you use in the book to describe the data model? CQL? When we wrote the RefCard on DZone (http://refcardz.dzone.com/refcardz/apache-cassandra), we intentionally favored/used CQL terminology. On advisement

Indexing JSON in Cassandra

2012-06-21 Thread Brian O'Neill
I know we had this conversation over on the dev list a while back: http://www.mail-archive.com/dev@cassandra.apache.org/msg03914.html I just wanted to let people know that we added the capability to our cassandra-indexing extension.

Re: Server Side Logic/Script - Triggers / StoreProc

2012-04-22 Thread Brian O'Neill
Praveen, We are certainly interested. To get things moving we implemented an add-on for Cassandra to demonstrate the viability (using AOP): https://github.com/hmsonline/cassandra-triggers Right now the implementation executes triggers asynchronously, allowing you to implement a java interface

Re: cassandra gui

2012-04-01 Thread Brian O'Neill
If you give Virgil a try, let me know how it goes. The REST layer is pretty solid, but the gui is just a PoC which makes it easy to see what's in the CFs during development/testing. (It's only a couple hundred lines of ExtJS code built on the REST layer) We had plans to add CQL to the gui for

Cassandra Triggers Capability published out to GitHub

2012-03-02 Thread Brian O'Neill
FYI -- http://brianoneill.blogspot.com/2012/03/cassandra-triggers-for-indexing-and.html https://github.com/hmsonline/cassandra-triggers Feedback welcome. Contribution and involvement is even better. ;) -brian -- Brian ONeill Lead Architect, Health Market Science

Virgil Moved (and Cassandra-Triggers coming soon)

2012-02-07 Thread Brian O'Neill
FYI -- we moved Virgil to Github to make it easier for people to contribute. https://github.com/hmsonline/virgil Also, we created an organization profile (hmsonline) to house all of our storm/cassandra related work. https://github.com/hmsonline Under that profile, we'll be releasing

Remote Hadoop Job Deployment

2012-01-24 Thread Brian O'Neill
FYI... we finally got around to releasing a version of Virgil that includes the ability to deploy jobs to remote Hadoop clusters running against Cassandra Column Families. http://brianoneill.blogspot.com/2012/01/virgil-remote-hadoop-job-deployment-via.html This has enabled an army of people to

Re: Cassandra to Oracle?

2012-01-22 Thread Brian O'Neill
call ad-hoc queries. Regards, Maxim On 1/20/2012 9:28 AM, Brian O'Neill wrote: I can't remember if I asked this question before, but We're using Cassandra as our transactional system, and building up quite a library of map/reduce jobs that perform data quality analysis

Re: Cassandra to Oracle?

2012-01-22 Thread Brian O'Neill
RDBMS which doesn't scale very well for what you call ad-hoc queries. Regards, Maxim On 1/20/2012 9:28 AM, Brian O'Neill wrote: I can't remember if I asked this question before, but We're using Cassandra as our transactional system, and building up quite a library of map

Re: Cassandra to Oracle?

2012-01-22 Thread Brian O'Neill
Good point Milind. (RE: Client-side AOP) I was thinking server-side to stay with the trigger concept, but we could just as easily intercept on the client-side. We'd just need to make sure that all clients got the AOP code injected. (including all of our map/reduce jobs) If we get the

Cassandra to Oracle?

2012-01-20 Thread Brian O'Neill
I can't remember if I asked this question before, but We're using Cassandra as our transactional system, and building up quite a library of map/reduce jobs that perform data quality analysis, statistics, etc. ( 100 jobs now) But... we are still struggling to provide an ad-hoc query mechanism

Re: Cassandra to Oracle?

2012-01-20 Thread Brian O'Neill
benchmark would be helpful) -brian On Fri, Jan 20, 2012 at 12:41 PM, Zach Richardson j.zach.richard...@gmail.com wrote: How much data do you think you will need ad hoc query ability for? On Fri, Jan 20, 2012 at 11:28 AM, Brian O'Neill b...@alumni.brown.edu wrote: I can't remember if I asked

Ad Hoc Queries

2012-01-20 Thread Brian O'Neill
? (even a simple count / or copy CF benchmark would be helpful) -brian On Fri, Jan 20, 2012 at 12:41 PM, Zach Richardson j.zach.richard...@gmail.com wrote: How much data do you think you will need ad hoc query ability for? On Fri, Jan 20, 2012 at 11:28 AM, Brian O'Neill b

Triggers?

2012-01-20 Thread Brian O'Neill
Anyone know if there is any activity to deliver triggers? I saw this quote: http://www.readwriteweb.com/cloud/2011/10/cassandra-reaches-10-whats-nex.php Ellis says that he's just starting to think about the post-1.0 world for Cassandra. Two features do come to mind, though, that missed the boat

Copy a column family?

2012-01-09 Thread Brian O'Neill
What is the fastest way to copy a column family? We were headed down the map/reduce path, but that seems silly. Any file level mechanisms for this? -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog:

Re: Copy a column family?

2012-01-09 Thread Brian O'Neill
Excellent. We'll give it a try. Thanks Brandon. -brian Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406 p: 215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http://brianoneill.blogspot.com/ On 1
