Re: are there any free Cassandra -> ElasticSearch connector / plugin ?

2016-10-13 Thread Brian O'Neill
I haven't used it yet, but https://github.com/vroyer/elassandra -- Brian O'Neill Principal Architect @ Monetate m: 215.588.6024 bone...@monetate.com > On Oct 13, 2016, at 6:02 PM, Eric Ho wr

Re: Trigger and customized filter

2012-07-10 Thread Brian O'Neill
While Jonathan and crew work on the infrastructure to support triggers: https://issues.apache.org/jira/browse/CASSANDRA-4285 We have a project going over here that provides a trigger-like capability: https://github.com/hmsonline/cassandra-triggers/ https://github.com/hmsonline/cassandra-triggers/w

An experiment using Spring Data w/ Cassandra (initially via JPA/Kundera)

2012-07-18 Thread Brian O'Neill
This is just an FYI. I experimented w/ Spring Data JPA w/ Cassandra leveraging Kundera. It sort of worked: https://github.com/boneill42/spring-data-jpa-cassandra http://brianoneill.blogspot.com/2012/07/spring-data-w-cassandra-using-jpa.html I'm now working on a pure Spring Data adapter using Ast

Re: How to manually build and maintain secondary indexes

2012-07-26 Thread Brian O'Neill
doesn't address all of your concerns, but I tried to capture the motivation behind our implementation here: http://brianoneill.blogspot.com/2012/03/cassandra-indexing-good-bad-and-ugly.html -brian -- Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Dr

Re: How to process new rows in parallel?

2012-08-03 Thread Brian O'Neill
If you are deleting the messages after processing, it sounds like you are using Cassandra as a work queue. Here are some links for implementing a distributed queue in Cassandra: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Distributed-work-queues-td5226248.html http://comments.

A Big Data Trifecta: Storm, Kafka and Cassandra

2012-08-04 Thread Brian O'Neill
Philip, I figured I would reply via blog post. =) http://brianoneill.blogspot.com/2012/08/a-big-data-trifecta-storm-kafka-and.html That blog post shows how we pieced together Kafka and Cassandra (via Storm). With LinkedIn behind Kafka, it is well supported. They use it in production. (and most l

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
We've used 'em all and... (IMHO) 1) I would avoid Thrift directly. 2) Hector is a sure bet. 3) Astyanax is the up and comer. 4) Kundera is good, but works like an ORM -- so not so good if your columns aren't defined ahead of time. -brian --- Brian O'Neill Lead Architect,

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
Thanks Dean… I hadn't played with that one. I wonder if that would better fit the bill for the Spring Data Cassandra module I'm hacking on. https://github.com/boneill42/spring-data-cassandra I'll poke around. -brian --- Brian O'Neill Lead Architect, Software Develop

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
FWIW.. I just threw this together... http://brianoneill.blogspot.com/2012/08/cassandra-apis-laundry-list.html Let me know if I missed any others. (I didn't have playorm on there) -brian On Thu, Aug 23, 2012 at 9:51 AM, Brian O'Neill wrote: > > Thanks Dean… I hadn't pla

Re: Cassandra API Library.

2012-08-23 Thread Brian O'Neill
Ha… how could I forget? =) Adding it now. --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarket

Re: Spring - cassandra

2012-08-29 Thread Brian O'Neill
You looking for the author of Spring Data Cassandra? https://github.com/boneill42/spring-data-cassandra If so, I guess that is me. =) -brian --- Brian O'Neill Lead Architect, Software Development Apache Cassandra MVP Health Market Science The Science of Better Results 2700 Horizon

Re: Spring - cassandra

2012-08-30 Thread Brian O'Neill
ahead and fork. Feel free to email me directly so we don't spam this list. (or setup a googlegroup just in case others want to contribute) -brian --- Brian O'Neill Lead Architect, Software Development Apache Cassandra MVP Health Market Science The Science of Better Results 2700 Horiz

Re: Cassandra API Library.

2012-09-04 Thread Brian O'Neill
You got it. (done) -brian On Tue, Sep 4, 2012 at 7:08 AM, Filipe Gonçalves wrote: > @Brian: you can add the Cassandra::Simple Perl client > http://fmgoncalves.github.com/p5-cassandra-simple/ > > > 2012/8/27 Paolo Bernardi >> >> On 08/23/2012 01:40 PM, Thomas Spengler wrote: >>> >>> 4) pelops (

Compound Keys: Connecting the dots between CQL3 and Java APIs

2012-09-11 Thread Brian O'Neill
Our data architects (ex-Oracle DBA types) are jumping on the CQL3 bandwagon and creating schemas for us. That triggered me to write a quick article mapping the CQL3 schemas to how they are accessed via Java APIs (for our dev team). I hope others find this useful as well: http://brianoneill.blogsp

Re: Data Modeling - JSON vs Composite columns

2012-09-19 Thread Brian O'Neill
Roshni, We're going through the same debate right now. I believe native support for JSON (or collections) is on the docket for Cassandra. Here is a discussion we had a few months ago on the topic: http://comments.gmane.org/gmane.comp.db.cassandra.devel/5233 We presently store JSON, but we're con

Re: Solr Use Cases

2012-09-19 Thread Brian O'Neill
Roshni, We're using SOLR to support ad hoc queries and fuzzy searches against unstructured data stored in Cassandra. Cassandra is great for storage and you can create data models and indexes that support your queries, provided you can anticipate those queries. When you can't anticipate the queri

Re: Using the commit log for external synchronization

2012-09-20 Thread Brian O'Neill
Along those lines... We sought to use triggers for external synchronization. If you read through this issue: https://issues.apache.org/jira/browse/CASSANDRA-1311 You'll see the idea of leveraging a commit log for synchronization, via triggers. We went ahead and implemented this concept in:

Re: Using the commit log for external synchronization

2012-09-21 Thread Brian O'Neill
elps. > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com > > On 21/09/2012, at 11:51 AM, Brian O'Neill wrote: > > > Along those lines... > > We sought to use triggers for external synchronization. If you

Re: Kundera 2.1 released

2012-09-21 Thread Brian O'Neill
Well done, Vivek and team!! This release was much anticipated. I'll give this a test with Spring Data JPA when I return from vacation. thanks, -brian On Sep 21, 2012, at 9:15 PM, Vivek Mishra wrote: > Hi All, > > We are happy to announce release of Kundera 2.0.7. > > Kundera is a JPA 2.0 b

Re: 1000's of column families

2012-10-01 Thread Brian O'Neill
Dean, We have the same question... We have thousands of separate feeds of data as well (20,000+). To date, we've been using a CF per feed strategy, but as we scale this thing out to accommodate all of those feeds, we're trying to figure out if we're going to blow out the memory. The initial doc

Re: 1000's of column families

2012-10-01 Thread Brian O'Neill
It's just a convenient way of prefixing: http://hector-client.github.com/hector/build/html/content/virtual_keyspaces.html -brian On Mon, Oct 1, 2012 at 4:22 PM, Ben Hood <0x6e6...@gmail.com> wrote: > Brian, > > On Mon, Oct 1, 2012 at 4:22 PM, Brian O'Neill wrote: >> W

Re: 1000's of column families

2012-10-02 Thread Brian O'Neill
Without putting too much thought into it... Given the underlying architecture, I think you could/would have to write your own partitioner, which would partition based on the prefix/virtual keyspace. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Scienc

Re: 1000's of CF's. virtual CFs do NOT work.....map/reduce

2012-10-02 Thread Brian O'Neill
Dean, Great point. I hadn't considered that either. Per my other email, think we would need a custom partitioner for this? (a mix of OrderPreservingPartitioner and RandomPartitioner, OPP for the prefix) -brian --- Brian O'Neill Lead Architect, Software Development Health Market S
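
The prefix-partitioning idea raised in this thread (order-preserving on the virtual keyspace prefix, randomized on the rest of the key) can be sketched roughly as below. This is only an illustration in plain Python, not Cassandra's partitioner interface: the ":" separator, the function names, and the sample keys are all assumptions made for the example.

```python
import hashlib


def split_key(row_key, sep=":"):
    """Split 'prefix:rest' into (prefix, rest). The ':' separator is an assumption."""
    prefix, _, rest = row_key.partition(sep)
    return prefix, rest


def hybrid_token(row_key, sep=":"):
    """Hypothetical token: ordered by prefix (OPP-like), hashed on the remainder
    (RandomPartitioner-like), so rows of one virtual keyspace stay contiguous."""
    prefix, rest = split_key(row_key, sep)
    digest = hashlib.md5(rest.encode("utf-8")).hexdigest()
    return (prefix, digest)


# Sample keys from two hypothetical feeds, sorted by the hybrid token.
keys = ["feedB:row1", "feedA:row2", "feedA:row1", "feedB:row9"]
ordered = sorted(keys, key=hybrid_token)
```

Sorting by such a token groups every key of one prefix together while spreading the keys within a prefix by hash, which is the property a range scan over a single virtual keyspace would rely on.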

Re: 1000's of column families

2012-10-02 Thread Brian O'Neill
Agreed. Do we know yet what the overhead is for each column family? What is the limit? If you have a SINGLE keyspace w/ 2+ CF's, what happens? Anyone know? -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 27

Re: 1000's of CF's. virtual CFs possible Map/Reduce SOLUTION...

2012-10-02 Thread Brian O'Neill
nd up using Storm, let me know. We have an unreleased version of the bolt that you probably want to use. (we're waiting on Nathan/Storm to fix some classpath loading issues) RE: a custom virtual keyspace Partitioner, point well taken -brian --- Brian O'Neill Lead Architect, Softw

Re: 1000's of column families

2012-10-02 Thread Brian O'Neill
Exactly. --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience.com This information transmit

Re: Using compound primary key

2012-10-08 Thread Brian O'Neill
Hey Vivek, The same thing happened to me the other day. You may be missing a component in your compound key. See this thread: http://mail-archives.apache.org/mod_mbox/cassandra-dev/201210.mbox/%3ccajhhpg20rrcajqjdnf8sf7wnhblo6j+aofksgbxyxwcoocg...@mail.gmail.com%3E I also wrote a couple blogs

Keeping the record straight for Cassandra Benchmarks...

2012-10-25 Thread Brian O'Neill
People probably saw... http://www.networkworld.com/cgi-bin/mailto/x.cgi?pagetosend=/news/tech/2012/102212-nosql-263595.html To clarify things take a look at... http://brianoneill.blogspot.com/2012/10/solid-nosql-benchmarks-from-ycsb-w-side.html -brian -- Brian ONeill Lead Architect, Health Mark

Re: logging servers? any interesting in one for cassandra?

2012-11-06 Thread Brian O'Neill
e over the next few months. -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience.

Re: logging servers? any interesting in one for cassandra?

2012-11-07 Thread Brian O'Neill
Thanks Dean. We'll definitely take a look. (probably in January) -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.

Indexing Data in Cassandra with Elastic Search

2012-11-08 Thread Brian O'Neill
For those looking to index data in Cassandra with Elastic Search, here is what we decided to do: http://brianoneill.blogspot.com/2012/11/big-data-quadfecta-cassandra-storm.html -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog:

Re: [BETA RELEASE] Apache Cassandra 1.2.0-beta2 released

2012-11-10 Thread Brian O'Neill
Wow...good catch. We had puppet scripts which automatically assigned the proper tokens given the cluster size. What is the range now? Got a link? -brian On Nov 10, 2012, at 9:27 PM, Edward Capriolo wrote: > just a note for all. The default partitioner is no longer randompartitioner. > It is

Re: Datatype Conversion in CQL-Client?

2012-11-18 Thread Brian O'Neill
If you are talking about the CQL-client that comes with Cassandra (cqlsh), it is actually written in Python: https://github.com/apache/cassandra/blob/trunk/bin/cqlsh For information on datatypes (and conversion) take a look at the CQL definition: http://www.datastax.com/docs/1.0/references/cql/i

Re: Datatype Conversion in CQL-Client?

2012-11-19 Thread Brian O'Neill
I don't think Michael and/or Jonathan have published the CQL java driver yet. (CCing them) Hopefully they'll find a public home for it soon, I hope to include it in the Webinar in December. (http://www.datastax.com/resources/webinars/collegecredit) -brian --- Brian O'Neill

Re: Datatype Conversion in CQL-Client?

2012-11-19 Thread Brian O'Neill
here the code is to apply that metadata to the result set in: https://github.com/apache/cassandra/blob/trunk/interface/thrift/gen-java/org/apache/cassandra/thrift/CqlResult.java -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Result

Re: Datastax Java Driver

2012-11-19 Thread Brian O'Neill
Woohoo! Thanks for making this available. --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience

Re: Datatype Conversion in CQL-Client?

2012-11-19 Thread Brian O'Neill
-brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience.com This information transmit

Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra

2012-12-12 Thread Brian O'Neill
FWIW -- I'm presenting tomorrow for the Datastax C*ollege Credit Webinar Series: http://brianoneill.blogspot.com/2012/12/presenting-for-datastax-college-credit.html I hope to make CQL part of the presentation and show how it integrates with the Java APIs. If you are interested, drop in. -brian -

Re: Best Java Driver for Cassandra?

2012-12-13 Thread Brian O'Neill
made available afterwards. I also have a laundry list here: (written before I knew about Firebrand) http://brianoneill.blogspot.com/2012/08/cassandra-apis-laundry-list.html -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 27

Re: CQL3 Compound Primary Keys - Do I have the right idea?

2012-12-22 Thread Brian O'Neill
Agreed. I actually flip between cli and cqlsh these days. cqlsh shows the logical view. cli shows the physical view. This is useful, especially when developing using a thrift-based client. Here are the slides and video if you want to have a look. -brian On Dec 22, 2012, at 3:36 AM, Wz1975

Re: Astyanax

2013-01-08 Thread Brian O'Neill
013/01/creating-your-frist-java-application-w.html -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • heal

Re: Cassandra 1.2 Thrift and CQL 3 issue

2013-01-12 Thread Brian O'Neill
I reported the issue here. You may be missing a component in your column name. https://issues.apache.org/jira/browse/CASSANDRA-5138 -brian On Jan 12, 2013, at 12:48 PM, Shahryar Sedghi wrote: > Hi > > I am trying to test my application that runs with JDBC, CQL 3 with Cassandra > 1.2. After

Webinar: Using Storm for Distributed Processing on Cassandra

2013-01-16 Thread Brian O'Neill
Just an FYI -- We will be hosting a webinar tomorrow demonstrating the use of Storm as a distributed processing layer on top of Cassandra. I'll be tag teaming with Taylor Goetz, the original author of storm-cassandra. http://www.datastax.com/resources/webinars/collegecredit It is part of the C*o

Re: Accessing Metadata of Column Familes

2013-01-28 Thread Brian O'Neill
Through CQL, you see the logical schema. Through CLI, you see the physical schema. This may help: http://www.datastax.com/dev/blog/cql3-for-cassandra-experts -brian On Mon, Jan 28, 2013 at 7:26 AM, Rishabh Agrawal wrote: > I found following issues while working on Cassandra version 1.2, CQL 3 a

Re: cql: show tables in a keystone

2013-01-28 Thread Brian O'Neill
cqlsh> use keyspace; cqlsh:cirrus> describe tables; For more info: cqlsh> help describe -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @b

Re: Netflix/Astynax Client for Cassandra

2013-02-07 Thread Brian O'Neill
Incidentally, we run Astyanax against 1.2.1. We haven't had any issues. When running against 1.2.0, we ran into this: https://github.com/Netflix/astyanax/issues/191 -brian --- Brian O'Neill Lead Architect, Software Development Health Market Science The Science of Better Results 27

Re: [ANN] SparkSQL support for Cassandra with Calliope

2014-10-03 Thread Brian O'Neill
Well done Rohit. (and crew) -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience.com This

Re: IF NOT EXISTS on UPDATE statements?

2014-11-18 Thread Brian O'Neill
would love to see: UPSERT value=new_value where (not exists || value=read_value) (ignoring some intricacies) -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42
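
The semantics wished for above (write if the row does not exist, or if the stored value still equals the value that was read) amount to a compare-and-set. A minimal sketch in plain Python follows, assuming a dict stands in for the table; `conditional_upsert` and `_MISSING` are hypothetical names, and the code is single-threaded, so it shows the condition only and none of the atomicity a real store would have to provide.

```python
_MISSING = object()  # sentinel: caller has no previously read value


def conditional_upsert(store, key, new_value, read_value=_MISSING):
    """Apply the write if the key is absent, or if the stored value still equals
    read_value. Returns True when the write was applied, False otherwise."""
    if key not in store:
        store[key] = new_value
        return True
    if read_value is not _MISSING and store[key] == read_value:
        store[key] = new_value
        return True
    return False


table = {}
first = conditional_upsert(table, "k", "v1")                      # key absent: applied
stale = conditional_upsert(table, "k", "v2", read_value="old")    # stale read: rejected
fresh = conditional_upsert(table, "k", "v2", read_value="v1")     # matching read: applied
```

A caller whose write is rejected would normally re-read the row and retry with the fresh value.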

Re: IF NOT EXISTS on UPDATE statements?

2014-11-18 Thread Brian O'Neill
Exactly. Perfect. Will do. Thanks Robert. -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience

Re: cassandra source code

2015-03-24 Thread Brian O'Neill
FWIW -- I just went through this, and posted the process I used to get up and running: http://brianoneill.blogspot.com/2015/03/getting-started-with-cassandra.html -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42

Re: Frequent timeout issues

2015-04-01 Thread Brian O'Neill
Are you using the storm-cassandra-cql driver? (https://github.com/hmsonline/storm-cassandra-cql) If so, what version? Batching or no batching? -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 <http://www.twi

Re: Cassandra - Storm

2015-04-03 Thread Brian O'Neill
I'd recommend using Storm's State abstraction. Check out: https://github.com/hmsonline/storm-cassandra-cql -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42> This

Re: Adhoc querying in Cassandra?

2015-04-22 Thread Brian O'Neill
+1, I think many organizations (including ours) pair Elastic Search with Cassandra. Use Cassandra as your system of record, then index the data with ES. -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42

Re: Adhoc querying in Cassandra?

2015-04-22 Thread Brian O'Neill
Again -- agreed. They have different usage patterns (C* heavy writes, ES heavy read), I would separate them. SOLR should be sufficient. I believe DSE is a tight integration between SOLR and C*. -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Co

Re: cassandra and spark from cloudera distirbution

2015-04-22 Thread Brian O'Neill
Depends which version of Spark you are running on Cloudera. Once you know that -- have a look at the compatibility chart here: https://github.com/datastax/spark-cassandra-connector -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 M

Re: Spark SQL JDBC Server + DSE

2015-05-28 Thread Brian O'Neill
assume you need JDBC connectivity specifically? -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42> This information transmitted in this email message is for the intended recipie

Re: Spark SQL JDBC Server + DSE

2015-05-30 Thread Brian O'Neill
- thrift-jdbcodbc-server I wouldn't mind collaborating on that, if you are headed in that direction. (and then I could write the REST server on top of that) LMK, -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42

Re: Spark SQL JDBC Server + DSE

2015-06-03 Thread Brian O'Neill
Kudos Ben. We've been tracking Zeppelin, and considered doing the same thing. You beat us to it. Well done. -brian --- Brian O'Neill Chief Technology Officer Health Market Science, a LexisNexis Company 215.588.6024 Mobile • @boneill42 <http://www.twitter.com/boneill42> This

Re: Support for ad-hoc query

2015-06-09 Thread Brian O'Neill
Cassandra isn't great at ad hoc queries. Many of us have paired it with an indexing engine like SOLR or Elastic Search. (built-into the DSE solution) As of late, I think there are a few of us exploring Spark SQL. (which you can then use via JDBC or REST) -brian --- Brian O'Neill

Drop keyspace via CQL hanging on master/trunk.

2013-12-05 Thread Brian O'Neill
'replication_factor':'1'}; cqlsh> describe keyspaces; system test_keyspace system_traces cqlsh> drop keyspace test_keyspace; thoughts? user error? worth filing an issue? One other note -- this happens using the CQL java driver as well. -brian --- Brian O'Neill

Re: Drop keyspace via CQL hanging on master/trunk.

2013-12-05 Thread Brian O'Neill
I removed the data directory just to make sure I had a clean environment. (eliminating the possibility of corrupt keyspaces/files causing problems) -brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19

Re: Drop keyspace via CQL hanging on master/trunk.

2013-12-10 Thread Brian O'Neill
Great. Thanks Aaron. FWIW, I am/was porting Virgil over CQL. I should be able to release a new REST API for C* (using CQL) shortly. -brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588

Dimensional SUM, COUNT, & DISTINCT in C* (replacing Acunu)

2013-12-17 Thread Brian O'Neill
We are seeking to replace Acunu in our technology stack / platform. It is the only component in our stack that is not open source. In preparation, over the last few weeks I’ve migrated Virgil to CQL. The vision is that Virgil could receive a REST request to upsert/delete data (hierarchical JSON

Re: Dimensional SUM, COUNT, & DISTINCT in C* (replacing Acunu)

2013-12-18 Thread Brian O'Neill
'll continue the discussion on the issue. thanks again, brian --- Brian O'Neill Chief Architect Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twitter.com/boneill42> • healthmarketscience

Re: CQL list command

2014-02-07 Thread Brian O'Neill
+1, agreed. I do the same thing. If cli is going away, we'll need this ability in cqlsh. I created a JIRA issue for it: https://issues.apache.org/jira/browse/CASSANDRA-6676 We'll see what the crew come back with. -brian --- Brian O'Neill Chief Technology Officer Health Market Scienc

[Blog] : Storm and Cassandra : A Three Year Retrospective

2014-02-13 Thread Brian O'Neill
A community member asked for a blog post on Storm + Cassandra. FWIW, here was our journey. http://brianoneill.blogspot.com/2014/02/storm-and-cassandra-three-year.html -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon

Re: Proposal: freeze Thrift starting with 2.1.0

2014-03-12 Thread Brian O'Neill
queries using more familiar syntax. (including future things such as joins, grouping, etc.) To me, that is exciting, and again -- one of the reasons we are leaning on it. my two cents, brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results

Re: Cassandra blob storage

2014-03-18 Thread Brian O'Neill
You may want to look at: https://github.com/Netflix/astyanax/wiki/Chunked-Object-Store -brian --- Brian O'Neill Chief Technology Officer Health Market Science The Science of Better Results 2700 Horizon Drive • King of Prussia, PA • 19406 M: 215.588.6024 • @boneill42 <http://www.twi

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Brian O'Neill
1.0 FTW! Nice work. -brian On Tue, Oct 18, 2011 at 8:32 AM, Viktor Jevdokimov < viktor.jevdoki...@adform.com> wrote: > Congrats!!! > > > Best regards/ Pagarbiai > > Viktor Jevdokimov > Senior Developer > > Email: viktor.jevdoki...@adform.com > Phone: +370 5 212 3063 > Fax: +370 5 261 0453 > > J.

Re: Using elasticsearch on cassandra nodes

2011-10-18 Thread Brian O'Neill
Anthony, We've been looking at elastic search as well. Presently we have SOLR in place, but it is cumbersome dealing with SOLR schemas when indexing information out of Cassandra (since you can't anticipate all the columns ahead of time). What are you using as your bridge between Cassandra and ES

Re: Using elasticsearch on cassandra nodes

2011-10-19 Thread Brian O'Neill
findings as well. cheers, -brian Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406 p: 215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http://brianoneill.blogspot.com/ From: Anthony Ikeda Reply-To

Re: Using elasticsearch on cassandra nodes

2011-10-21 Thread Brian O'Neill
r/conf/schema.xml#L538 > > But you may still want to define a schema so you can adjust the index and > query time processing/typing of the field values. > > Cheers > > - > Aaron Morton > Freelance Developer > @aaronmorton > http://www.thelastpickle.com

REST API on the Server Side

2011-10-25 Thread Brian O'Neill
Sasha, Thanks for the feedback. I can appreciate your comment on connection pooling, and it is certainly a matter of taste/purpose/perspective. In our case, it helps to have the REST layer because it's a more natural fit into our platform/ecosystem (considering we use COTS ETL tools, workflows, e

Value-Added Services Layer

2011-10-25 Thread Brian O'Neill
Sasha, Thinking a little more about "what problem the REST API solves"... To be honest, I agree completely. I don't think a REST layer that provides the same feature/function as CQL is all that valuable except in cases like I described (which may not be all that common). Also, to be honest, I d

R on Cassandra

2011-11-01 Thread Brian O'Neill
I saw a mention of R on Cassandra: http://comments.gmane.org/gmane.comp.db.cassandra.user/5681 Does anyone know if this has traction somewhere? -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://weblogs.java.net/blog/bo

Re: Tool for SQL -> Cassandra data movement

2011-11-02 Thread Brian O'Neill
ob can load the data directly into Cassandra. (using a ColumnFamilyOutput format) We are solving this problem right now, so I'll report back. -brian ---- Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406 p: 21

Cassandra Integration w/ SOLR using Virgil

2011-11-04 Thread Brian O'Neill
Up front, I'd like to say this is still pretty raw. We'd love to get feedback (and better yet contributions ;). With that as a disclaimer, I added SOLR integration to Virgil. When you add and delete rows and columns via the REST interface, an index is updated in SOLR. For more information check

Re: Second Cassandra users survey

2011-11-07 Thread Brian O'Neill
It should be dead-simple to build a slick GUI on the REST layer. (@Virgil ) I had planned to crank one out this week (using ExtJS) that mimicked the Squirrel/Toad look and feel. The UI would have a tree-panel of keyspaces and column families o

Re: security

2011-11-09 Thread Brian O'Neill
Not sure this is the "standard approach", probably more "what we came up with". ;) We plan to deploy Cassandra behind a firewall denying all traffic on all ports other than 8080. Access from applications will be limited to the REST/HTTP layer, which we'll lock down with standard HTTP authenticati

GUI for Cassandra now included in Virgil

2011-11-21 Thread Brian O'Neill
I got around to implementing a GUI for Cassandra in Virgil. It was really simple. (100 lines of javascript) You can see a screenshot here: http://code.google.com/a/apache-extras.org/p/virgil/wiki/gui For those that were looking for a way to embed data visualization into their applications, you ca

Added run-modes to Virgil: Run embedded or against a remote Cassandra.

2011-11-28 Thread Brian O'Neill
I'm not sure if this was preventing anyone from using Virgil, but we added run-modes to Virgil to accommodate users that have an existing cluster. Now, you can just point Virgil at the remote instance and use it only for the GUI/REST layer and SOLR integration. http://brianoneill.blogspot.com/2011/

MapReduce on Cassandra using Ruby and REST!

2011-12-01 Thread Brian O'Neill
I know I've been spamming the list a bit with new features for Virgil, but this one is actually really cool... Enamored with what Riak provides as far as map/reduce via HTTP, http://wiki.basho.com/MapReduce.html#MapReduce-via-the-HTTP-API We implemented the same thing for Virgil/Cassandra. Simp

Presentations from NYC?

2011-12-09 Thread Brian O'Neill
I may have missed it... Were the presentations posted from NYC? (Specifically, I'm looking for Nate McCall's presentation) -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http:

Binary Packages available for Virgil

2011-12-09 Thread Brian O'Neill
In response to a few requests for a binary distribution, we just posted artifacts for Virgil. http://code.google.com/a/apache-extras.org/p/virgil/downloads/list For simplicity, we're keeping the version number aligned with the version of Cassandra. (which is important when you are running with an

Re: Using Cassandra in Rails App

2011-12-15 Thread Brian O'Neill
I'm not sure this is the best answer, but all of our webapps (RoR included) access Cassandra via REST. That is one of the major reasons we built Virgil. http://code.google.com/a/apache-extras.org/p/virgil/ It allows us to build the webapps, for the most part, independent of the actual storage mec

Re: cassandra data to hadoop.

2011-12-23 Thread Brian O'Neill
I'm not sure this is much help, but we actually run Hadoop jobs to load and extract data to and from HDFS. You can use ColumnFamilyInputFormat to race over the data in Cassandra and output it to a file. That doesn't solve the continuous problem, but should give you a batch mechanism to refresh th

Re: Presentations from NYC?

2011-12-27 Thread Brian O'Neill
multidimensional metrics. > > 2011/12/10 Jonathan Ellis > Not yet -- we're working on it. > > On Fri, Dec 9, 2011 at 1:48 PM, Brian O'Neill wrote: > > > > I may have missed it... > > Were the presentations posted from NYC? > > (Specific

Re: Peregrine: A new map reduce framework for iterative/pipelined jobs.

2011-12-27 Thread Brian O'Neill
Kevin, I just pulled the code and read through the design. Great stuff. Any thought to potentially using this for real-time processing as well? Right now, we have a set of Hadoop M/R jobs that operate against Cassandra for ETL. We were looking at using Storm for the real-time processing side

Re: Hector and CQL

2012-01-05 Thread Brian O'Neill
If you are looking to add hector, you'll need: me.prettyprint hector 1.0-2 -brian ---- Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406 p: 215.588.6024 blog: http://weblogs.java.net/blog/boneill42/

Copy a column family?

2012-01-09 Thread Brian O'Neill
What is the fastest way to copy a column family? We were headed down the map/reduce path, but that seems silly. Any file level mechanisms for this? -brian -- Brian ONeill Lead Architect, Health Market Science (http://healthmarketscience.com) mobile:215.588.6024 blog: http://weblogs.java.net/blog

Re: Copy a column family?

2012-01-09 Thread Brian O'Neill
Excellent. We'll give it a try. Thanks Brandon. -brian ---- Brian O'Neill Lead Architect, Software Development Health Market Science | 2700 Horizon Drive | King of Prussia, PA 19406 p: 215.588.6024 blog: http://weblogs.java.net/blog/boneill42/ blog: http://brianoneill.blogspot.com/

Cassandra to Oracle?

2012-01-20 Thread Brian O'Neill
I can't remember if I asked this question before, but We're using Cassandra as our transactional system, and building up quite a library of map/reduce jobs that perform data quality analysis, statistics, etc. (> 100 jobs now) But... we are still struggling to provide an "ad-hoc" query mechani

Re: Cassandra to Oracle?

2012-01-20 Thread Brian O'Neill
benchmark would be helpful) -brian On Fri, Jan 20, 2012 at 12:41 PM, Zach Richardson < j.zach.richard...@gmail.com> wrote: > How much data do you think you will need ad hoc query ability for? > > > On Fri, Jan 20, 2012 at 11:28 AM, Brian O'Neill wrote: > >> >

Ad Hoc Queries

2012-01-20 Thread Brian O'Neill
ce against Cassandra? > (even a simple count / or copy CF benchmark would be helpful) > > -brian > > On Fri, Jan 20, 2012 at 12:41 PM, Zach Richardson < > j.zach.richard...@gmail.com> wrote: > >> How much data do you think you will need ad hoc query ability for

Triggers?

2012-01-20 Thread Brian O'Neill
Anyone know if there is any activity to deliver triggers? I saw this quote: http://www.readwriteweb.com/cloud/2011/10/cassandra-reaches-10-whats-nex.php "Ellis says that he's just starting to think about the post-1.0 world for Cassandra. Two features do come to mind, though, that missed the boat

Re: Cassandra to Oracle?

2012-01-22 Thread Brian O'Neill
it to death (because otherwise the "ad hoc" > queries won't work well if at all), and at this point you may be hit with a > performance penalty. > > It may be a good idea to interview users and build denormalized views in > Cassandra, maybe on a separate "look-up

Re: Cassandra to Oracle?

2012-01-22 Thread Brian O'Neill
the "ad hoc" > queries won't work well if at all), and at this point you may be hit with a > performance penalty. > > It may be a good idea to interview users and build denormalized views in > Cassandra, maybe on a separate "look-up" cluster. A few percent

Re: Cassandra to Oracle?

2012-01-22 Thread Brian O'Neill
evel. > > Your migration also needs to be attended to and might need a MR first and AOP > intercepted writes. > > Hth > Milind > > > /*** > sent from my android...please pardon occasional typos as I respond @ the > speed of thought >

Remote Hadoop Job Deployment

2012-01-24 Thread Brian O'Neill
FYI... we finally got around to releasing a version of Virgil that includes the ability to deploy jobs to remote Hadoop clusters running against Cassandra Column Families. http://brianoneill.blogspot.com/2012/01/virgil-remote-hadoop-job-deployment-via.html This has enabled an army of people to wr

Virgil Moved (and Cassandra-Triggers coming soon)

2012-02-07 Thread Brian O'Neill
FYI -- we moved Virgil to Github to make it easier for people to contribute. https://github.com/hmsonline/virgil Also, we created an organization profile (hmsonline) to house all of our storm/cassandra related work. https://github.com/hmsonline Under that profile, we'll be releasing cassandra-tri
