Re: query by column size

2015-02-13 Thread chandra Varahala
I already have a secondary index on that column, but how do I query that
column by size?

thanks
chandra

On Fri, Feb 13, 2015 at 3:30 AM, Marcelo Valle (BLOOMBERG/ LONDON) 
mvallemil...@bloomberg.net wrote:

 There is no automatic indexing in Cassandra. There are secondary indexes,
 but not for these cases.
 You could use a solution like DSE to get data automatically indexed in
 Solr on each node as soon as it arrives. Then you could run such a query
 against Solr.
 If the query can be slow, you could run a MapReduce job over all rows,
 filtering out the ones you want.
 []s

 From: user@cassandra.apache.org
 Subject: Re: query by column size

 Greetings,

 I have one column family with 10 columns; in one of the columns we store
 XML/JSON.
 Is there a way I can query that column where size > 50 KB, assuming I
 have an index on that column?

 thanks
 CV.
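Marcelo's MapReduce suggestion above can be sketched as a plain map-side filter (an illustrative sketch, not actual Hadoop/Cassandra job code; the row layout is made up, and the 50 KB threshold comes from the question):

```python
# Map-side filter: keep only rows whose stored payload exceeds 50 KB.
# Rows are (row_key, payload) pairs; in a real MR job the payload would
# come out of the column family being scanned.

THRESHOLD = 50 * 1024  # 50 KB

def filter_large_payloads(rows):
    """Yield the keys of rows whose xml/json payload is larger than 50 KB."""
    for row_key, payload in rows:
        if len(payload.encode("utf-8")) > THRESHOLD:
            yield row_key
```

Classic secondary indexes only support equality predicates, which is why a range condition on size ends up as a scan-and-filter like this rather than an index lookup.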





query by column size

2015-02-12 Thread chandra Varahala
Greetings,

I have one column family with 10 columns; in one of the columns we store
XML/JSON.
Is there a way I can query that column where size > 50 KB, assuming I
have an index on that column?

thanks
CV.


Re: Moving from relational to Cassandra, how to handle intra-table relationships?

2014-01-22 Thread chandra Varahala
Hello,

You can implement relations in a couple of ways: serialized JSON/XML, or CQL
collection types.

Thanks
Chandra
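The serialized-JSON option discussed in this thread comes with the read-modify-write caveat raised below: the whole blob must be read back before every update. A minimal sketch (field and column names here are made up):

```python
import json

# Parent row stores its related children serialized as JSON in one column
# (the "serialized JSON" option; "children_json" is a hypothetical column).

def add_child(parent_row, child_id):
    """Read-modify-write: deserialize, append, re-serialize the whole blob."""
    children = json.loads(parent_row.get("children_json", "[]"))
    children.append(child_id)
    parent_row["children_json"] = json.dumps(children)
    return parent_row

row = {"id": "parent-1", "children_json": "[]"}
add_child(row, "child-42")
```

A CQL collection column avoids the full read-back for small sets of ids, at the cost of collection size limits.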


On Tue, Jan 21, 2014 at 8:58 PM, Les Hartzman lhartz...@gmail.com wrote:

 True. Fortunately though in this application, the data is
 write-once/read-many. So that is one bullet I would dodge!

 Les


 On Tue, Jan 21, 2014 at 5:34 PM, Patricia Gorla 
 patri...@thelastpickle.com wrote:

 Hey,

 One thing to keep in mind if you want to go the serialized JSON route is
 that you will need to read out the data each time you want to do an update.

 Cheers,
 Patricia


 On Tuesday, January 21, 2014, Les Hartzman lhartz...@gmail.com wrote:

 Hi,

 I'm looking to move from a relational DB to Cassandra. I just found that
 there are intra-table relationships in one table where the ids of the
 related rows are saved in a 'parent' row.

 How can these kinds of relationships be handled in Cassandra? I'm
 thinking that if the individual rows need to live on their own, perhaps I
 should store the data as serialized JSON in its own column of the parent.

 All thoughts appreciated!

 Thanks.

 Les



 --
 Patricia Gorla
 @patriciagorla

 Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com





Re: Help me to find Compatable JDBC jar for Apache Cassandra 2.0.4

2014-01-22 Thread chandra Varahala
Did you put these jars in classpath ?

cassandra-all-1.x.x.jar
guava
jackson-core-asl
jackson-mapper-asl
libthrift
snappy
slf4j-api
metrics-core
netty


thanks
Chandra
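Chandra's jar checklist above can be verified programmatically: check that each required jar actually exists under the Cassandra lib directory (a hypothetical helper; the name prefixes come from the list above, version numbers vary per release):

```python
from pathlib import Path

# Jar name prefixes from the list above; actual files carry version suffixes.
REQUIRED = ["cassandra-all", "guava", "jackson-core-asl", "jackson-mapper-asl",
            "libthrift", "snappy", "slf4j-api", "metrics-core", "netty"]

def missing_jars(lib_dir):
    """Return the required jar prefixes with no matching jar in lib_dir."""
    present = [p.name for p in Path(lib_dir).glob("*.jar")]
    return [req for req in REQUIRED
            if not any(name.startswith(req) for name in present)]
```

Anything this returns is a jar that still needs to go on the classpath.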


On Wed, Jan 22, 2014 at 12:52 PM, Colin Clark co...@clark.ws wrote:

 Is the jar on the path?  Is cassandra home set correctly?

 Looks like Cassandra can't find the jar - verify its existence by searching.

 --
 Colin
 +1 320 221 9531



 On Jan 22, 2014, at 11:50 AM, Chiranjeevi Ravilla rccassandr...@gmail.com
 wrote:

 Hi  All,

 I am using Apache Cassandra 2.0.4 with cassandra-jdbc-1.2.5.jar. I am
 trying to run a sample Java program and am getting the error below. Please
 tell me whether I am using the right JDBC driver, or suggest a supported
 one.


 log4j:WARN No appenders could be found for logger
 (org.apache.cassandra.cql.jdbc.CassandraDriver).
 log4j:WARN Please initialize the log4j system properly.
 log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for
 more info.
 Exception in thread main java.lang.NoClassDefFoundError:
 org/apache/cassandra/cql/jdbc/AbstractJdbcType
 at
 org.apache.cassandra.cql.jdbc.CassandraConnection.init(CassandraConnection.java:146)
  at
 org.apache.cassandra.cql.jdbc.CassandraDriver.connect(CassandraDriver.java:92)
 at java.sql.DriverManager.getConnection(DriverManager.java:571)
  at java.sql.DriverManager.getConnection(DriverManager.java:233)
 at CassandraTest.main(CassandraTest.java:12)
 Caused by: java.lang.ClassNotFoundException:
 org.apache.cassandra.cql.jdbc.AbstractJdbcType
 at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
  at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
 at java.security.AccessController.doPrivileged(Native Method)
  at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
  at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:358)


 Thanks in Advance..

 --Chiru




Re: AddContractPoint /VIP

2014-01-18 Thread chandra Varahala
Thank you.

-Chandra


On Wed, Dec 11, 2013 at 9:06 PM, Aaron Morton aa...@thelastpickle.com wrote:

 What is the good practice for addContactPoint in the code, i.e., how
 many servers?

 I use the same nodes as the seed list nodes for that DC.

 The idea of the seed list is that it’s a list of well known nodes, and
 it’s easier operationally to say we have one list of well known nodes that
 is used by the servers and the clients.

 1) I am also thinking of doing it this way (I am not sure if this is good
 or bad): configure the 4 servers behind one VIP (virtual IP / virtual DNS)
 and specify that DNS name in the code as the contact point, so that the
 VIP is smart enough to route to different nodes.

 Too complicated.

 2) Is that a problem if I use multiple data centers in the future?

 You only need to give the client the local seeds, it will discover all the
 nodes.

 Cheers

 -
 Aaron Morton
 New Zealand
 @aaronmorton

 Co-Founder & Principal Consultant
 Apache Cassandra Consulting
 http://www.thelastpickle.com

 On 7/12/2013, at 7:12 am, chandra Varahala hadoopandcassan...@gmail.com
 wrote:

 Greetings,

 I have a 4-node Cassandra cluster that will grow up to 10 nodes; we are
 using the CQL Java client to access the data.
 What is the good practice for addContactPoint in the code, i.e., how
 many servers?

 1) I am also thinking of doing it this way (I am not sure if this is good
 or bad): configure the 4 servers behind one VIP (virtual IP / virtual DNS)
 and specify that DNS name in the code as the contact point, so that the
 VIP is smart enough to route to different nodes.

 2) Is that a problem if I use multiple data centers in the future?


 thanks
 Chandra
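Aaron's rule of thumb above ("give the client the local seeds, it will discover all the nodes") can be captured in a tiny helper. This is an illustrative sketch, not driver API; the per-datacenter seed mapping is a made-up config:

```python
# Made-up config: the per-datacenter seed lists the servers already use.
SEEDS_BY_DC = {
    "DC1": ["10.0.1.10", "10.0.1.11"],
    "DC2": ["10.0.2.10", "10.0.2.11"],
}

def contact_points(local_dc):
    """Contact points for a client = the seed nodes of its local DC only;
    the driver discovers the remaining nodes in the cluster by itself."""
    return SEEDS_BY_DC[local_dc]
```

Keeping one well-known list shared by servers and clients avoids both the VIP indirection and hard-coding every node.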





data export with different replication factor.

2014-01-18 Thread chandra Varahala
Greetings,

I have a 6-node production cluster with a replication factor of 3 and
4 keyspaces, and a test cluster with 2 nodes. Is there a way I can export
data from the production cluster and copy it into the test cluster with a
replication factor of 1?

Thanks
Chandra


AddContractPoint /VIP

2013-12-06 Thread chandra Varahala
Greetings,

I have a 4-node Cassandra cluster that will grow up to 10 nodes; we are
using the CQL Java client to access the data.
What is the good practice for addContactPoint in the code, i.e., how
many servers?

1) I am also thinking of doing it this way (I am not sure if this is good
or bad): configure the 4 servers behind one VIP (virtual IP / virtual DNS)
and specify that DNS name in the code as the contact point, so that the
VIP is smart enough to route to different nodes.

2) Is that a problem if I use multiple data centers in the future?


thanks
Chandra


replica verification

2013-12-02 Thread chandra Varahala
Hello Team,


I have a Cassandra cluster with 5 nodes, initially with a replication factor
of 1. I have now changed the replication factor to 3 and run nodetool repair.
Is there a way I can verify that I have 3 replicas?


Thanks
Chandra


merge sstables

2013-07-11 Thread chandra Varahala
Hello ,

 I have around 2000 small SSTables of about 5 MB each. Is there a way I
can merge them into bigger ones?

thanks
chandra


Re: merge sstables

2013-07-11 Thread chandra Varahala
Yes, but nodetool scrub is not working...


thanks
chandra


On Thu, Jul 11, 2013 at 2:39 PM, Faraaz Sareshwala 
fsareshw...@quantcast.com wrote:

 I assume you are using the leveled compaction strategy because you have 5mb
 sstables and 5mb is the default size for leveled compaction.

 To change this default, you can run the following in the cassandra-cli:

 update column family cf_name with compaction_strategy_options =
 {sstable_size_in_mb: 256};

 To force the current sstables to be rewritten, I think you'll need to
 issue a
 nodetool scrub on each node. Someone please correct me if I'm wrong on
 this.

 Faraaz

 On Thu, Jul 11, 2013 at 11:34:08AM -0700, chandra Varahala wrote:
  Hello ,
 
   I have around 2000 small SSTables of about 5 MB each. Is there a way I
  can merge them into bigger ones?
 
  thanks
  chandra



Re: JNA not found.

2013-01-29 Thread chandra Varahala
I think you need the jna jar and the jna-platform jar in the Cassandra lib folder.

-chandra



On Mon, Jan 28, 2013 at 10:02 PM, Tim Dunphy bluethu...@gmail.com wrote:

 I went to github to try to download jna again. I downloaded version 3.5.1

 [root@cassandra-node01 cassandrahome]# ls -l lib/jna-3.5.1.jar
 -rw-r--r-- 1 root root 692603 Jan 28 21:57 lib/jna-3.5.1.jar

 I noticed in the datastax docs that java 7 was not recommended so I
 downgraded to java 6

 [root@cassandra-node01 cassandrahome]# java -version
 java version 1.6.0_34
 Java(TM) SE Runtime Environment (build 1.6.0_34-b04)
 Java HotSpot(TM) 64-Bit Server VM (build 20.9-b04, mixed mode)

 And now if I try to start cassandra with that library it fails with this
 message:

 [root@cassandra-node01 cassandrahome]# ./bin/cassandra -f
 xss =  -ea -javaagent:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms295M -Xmx295M
 -Xmn73M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
  INFO 22:00:14,318 Logging initialized
  INFO 22:00:14,333 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
 VM/1.6.0_34
  INFO 22:00:14,334 Heap size: 301727744/302776320
  INFO 22:00:14,334 Classpath:
 /etc/alternatives/cassandrahome/conf:/etc/alternatives/cassandrahome/build/classes/main:/etc/alternatives/cassandrahome/build/classes/thrift:/etc/alternatives/cassandrahome/lib/antlr-3.2.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-1.2.1.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-clientutil-1.2.1.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-thrift-1.2.1.jar:/etc/alternatives/cassandrahome/lib/avro-1.4.0-fixes.jar:/etc/alternatives/cassandrahome/lib/avro-1.4.0-sources-fixes.jar:/etc/alternatives/cassandrahome/lib/commons-cli-1.1.jar:/etc/alternatives/cassandrahome/lib/commons-codec-1.2.jar:/etc/alternatives/cassandrahome/lib/commons-lang-2.6.jar:/etc/alternatives/cassandrahome/lib/compress-lzf-0.8.4.jar:/etc/alternatives/cassandrahome/lib/concurrentlinkedhashmap-lru-1.3.jar:/etc/alternatives/cassandrahome/lib/guava-13.0.1.jar:/etc/alternatives/cassandrahome/lib/high-scale-lib-1.1.2.jar:/etc/alternatives/cassandrahome/lib/jackson-core-asl-1.9.2.jar:/etc/alternatives/cassandrahome/lib/jackson-mapper-asl-1.9.2.jar:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar:/etc/alternatives/cassandrahome/lib/jline-1.0.jar:/etc/alternatives/cassandrahome/lib/jna-3.5.1.jar:/etc/alternatives/cassandrahome/lib/json-simple-1.1.jar:/etc/alternatives/cassandrahome/lib/libthrift-0.7.0.jar:/etc/alternatives/cassandrahome/lib/log4j-1.2.16.jar:/etc/alternatives/cassandrahome/lib/metrics-core-2.0.3.jar:/etc/alternatives/cassandrahome/lib/netty-3.5.9.Final.jar:/etc/alternatives/cassandrahome/lib/servlet-api-2.5-20081211.jar:/etc/alternatives/cassandrahome/lib/slf4j-api-1.7.2.jar:/etc/alternatives/cassandrahome/lib/slf4j-log4j12-1.7.2.jar:/etc/alternatives/cassandrahome/lib/snakeyaml-1.6.jar:/etc/alternatives/cassandrahome/lib/snappy-java-1.0.4.1.jar:/etc/alternatives/cassandrahome/lib/snaptree-0.1.jar:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar
 Killed

 I move the library back out of the lib directory and cassandra starts
 again albeit without JNA working quite naturally.


 Both my cassandra and java installs are tarball installs.

 Thanks
 Tim


 On Mon, Jan 28, 2013 at 6:29 PM, Tim Dunphy bluethu...@gmail.com wrote:

 Hey List,

  I just downloaded 1.2.1 and have set it up across my cluster. Then I
 noticed the following notice:

  INFO 18:14:53,828 JNA not found. Native methods will be disabled.

 So I downloaded jna.jar from git hub and moved it to the cassandra /lib
 directory. I changed mod to 755 as per the datastax docs. I've also tried
 installing the jna package (via yum, I am using centos 6.2). Nothing seems
 to do the trick, I keep getting this message. What can I do to get
 cassandra 1.2.1 to recognize JNA?

 Thanks
 Tim
 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B




 --
 GPG me!!

 gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B




Re: JNA not found.

2013-01-29 Thread chandra Varahala
We had this issue before, but after adding those two jars the error was
gone. We used Cassandra 1.0.8 (JNA 3.3.0, JNA platform 3.3.0). What version
of Cassandra are you using?

-chandra


On Tue, Jan 29, 2013 at 12:19 PM, Tim Dunphy bluethu...@gmail.com wrote:

 Hi Chandra,

 Thanks for your reply. Well I have added both jna.jar and platform.jar to
 my lib directory (jna 3.3.0):

 [root@cassandra-node01 cassandrahome]# ls -l lib/jna.jar lib/platform.jar
 -rw-r--r-- 1 root root 865400 Jan 29 12:14 lib/jna.jar
 -rw-r--r-- 1 root root 841291 Jan 29 12:14 lib/platform.jar

 But sadly I get the same result:


 [root@cassandra-node01 cassandrahome]# ./bin/cassandra -f
 xss =  -ea -javaagent:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms295M -Xmx295M
 -Xmn73M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
  INFO 12:14:52,493 Logging initialized
  INFO 12:14:52,507 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
 VM/1.6.0_34
  INFO 12:14:52,507 Heap size: 301727744/302776320
  INFO 12:14:52,508 Classpath:
 /etc/alternatives/cassandrahome/conf:/etc/alternatives/cassandrahome/build/classes/main:/etc/alternatives/cassandrahome/build/classes/thrift:/etc/alternatives/cassandrahome/lib/antlr-3.2.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-1.2.1.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-clientutil-1.2.1.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-thrift-1.2.1.jar:/etc/alternatives/cassandrahome/lib/avro-1.4.0-fixes.jar:/etc/alternatives/cassandrahome/lib/avro-1.4.0-sources-fixes.jar:/etc/alternatives/cassandrahome/lib/commons-cli-1.1.jar:/etc/alternatives/cassandrahome/lib/commons-codec-1.2.jar:/etc/alternatives/cassandrahome/lib/commons-lang-2.6.jar:/etc/alternatives/cassandrahome/lib/compress-lzf-0.8.4.jar:/etc/alternatives/cassandrahome/lib/concurrentlinkedhashmap-lru-1.3.jar:/etc/alternatives/cassandrahome/lib/guava-13.0.1.jar:/etc/alternatives/cassandrahome/lib/high-scale-lib-1.1.2.jar:/etc/alternatives/cassandrahome/lib/jackson-core-asl-1.9.2.jar:/etc/alternatives/cassandrahome/lib/jackson-mapper-asl-1.9.2.jar:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar:/etc/alternatives/cassandrahome/lib/jline-1.0.jar:/etc/alternatives/cassandrahome/lib/jna.jar:/etc/alternatives/cassandrahome/lib/json-simple-1.1.jar:/etc/alternatives/cassandrahome/lib/libthrift-0.7.0.jar:/etc/alternatives/cassandrahome/lib/log4j-1.2.16.jar:/etc/alternatives/cassandrahome/lib/metrics-core-2.0.3.jar:/etc/alternatives/cassandrahome/lib/netty-3.5.9.Final.jar:/etc/alternatives/cassandrahome/lib/platform.jar:/etc/alternatives/cassandrahome/lib/servlet-api-2.5-20081211.jar:/etc/alternatives/cassandrahome/lib/slf4j-api-1.7.2.jar:/etc/alternatives/cassandrahome/lib/slf4j-log4j12-1.7.2.jar:/etc/alternatives/cassandrahome/lib/snakeyaml-1.6.jar:/etc/alternatives/cassandrahome/lib/snappy-java-1.0.4.1.jar:/etc/alternatives/cassandrahome/lib/snaptree-0.1.jar:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar
 Killed

 And still when I remove those library files Cassandra starts without a
 problem, except for the fact that it is not able to use JNA.

 I'd appreciate any input the list might have.

 Thanks
 Tim


 On Tue, Jan 29, 2013 at 8:54 AM, chandra Varahala 
 hadoopandcassan...@gmail.com wrote:

 I think you need the jna jar and the jna-platform jar in the Cassandra lib folder.

 -chandra



 On Mon, Jan 28, 2013 at 10:02 PM, Tim Dunphy bluethu...@gmail.com wrote:

 I went to github to try to download jna again. I downloaded version 3.5.1

 [root@cassandra-node01 cassandrahome]# ls -l lib/jna-3.5.1.jar
 -rw-r--r-- 1 root root 692603 Jan 28 21:57 lib/jna-3.5.1.jar

 I noticed in the datastax docs that java 7 was not recommended so I
 downgraded to java 6

 [root@cassandra-node01 cassandrahome]# java -version
 java version 1.6.0_34
 Java(TM) SE Runtime Environment (build 1.6.0_34-b04)
 Java HotSpot(TM) 64-Bit Server VM (build 20.9-b04, mixed mode)

 And now if I try to start cassandra with that library it fails with this
 message:

 [root@cassandra-node01 cassandrahome]# ./bin/cassandra -f
 xss =  -ea -javaagent:/etc/alternatives/cassandrahome/lib/jamm-0.2.5.jar
 -XX:+UseThreadPriorities -XX:ThreadPriorityPolicy=42 -Xms295M -Xmx295M
 -Xmn73M -XX:+HeapDumpOnOutOfMemoryError -Xss180k
  INFO 22:00:14,318 Logging initialized
  INFO 22:00:14,333 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
 VM/1.6.0_34
  INFO 22:00:14,334 Heap size: 301727744/302776320
  INFO 22:00:14,334 Classpath:
 /etc/alternatives/cassandrahome/conf:/etc/alternatives/cassandrahome/build/classes/main:/etc/alternatives/cassandrahome/build/classes/thrift:/etc/alternatives/cassandrahome/lib/antlr-3.2.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-1.2.1.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-clientutil-1.2.1.jar:/etc/alternatives/cassandrahome/lib/apache-cassandra-thrift-1.2.1.jar:/etc/alternatives/cassandrahome/lib/avro-1.4.0-fixes.jar:/etc/alternatives/cassandrahome/lib

Re: Denormalization

2013-01-28 Thread chandra Varahala
In my experience, we design main column families and lookup column
families. A main column family holds all the denormalized data; a lookup
column family holds the row key of the corresponding main column family.

For example, the Users column family holds all of a user's denormalized
data, and a lookup column family is named something like userByEmail. A
request to userByEmail first returns a unique key, which is the row key of
the Users column family; a second call to Users then returns all the data.
The other lookup column families work the same way.

-
Chandra
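The two-step lookup pattern described above can be sketched with plain dicts standing in for the two column families (the userByEmail name comes from the post; the data is made up):

```python
# Two column families, modeled as dicts: a lookup CF keyed by email that
# stores only the Users row key, and the main Users CF with the full row.
user_by_email = {"alice@example.com": "user-123"}                      # lookup CF
users = {"user-123": {"name": "Alice", "email": "alice@example.com"}}  # main CF

def get_user_by_email(email):
    """Step 1: lookup CF -> row key. Step 2: main CF -> denormalized row."""
    row_key = user_by_email[email]
    return users[row_key]
```

Each additional lookup column family (by username, by phone, ...) is just another small index table pointing back at the same main row key.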



On Sun, Jan 27, 2013 at 8:53 PM, Hiller, Dean dean.hil...@nrel.gov wrote:

 Agreed, was just making sure others knew ;).

 Dean

 From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Sunday, January 27, 2013 6:51 PM
 To: user@cassandra.apache.org
 Subject: Re: Denormalization

 When I said that writes were cheap, I was speaking that in a normal case
 people are making 2-10 inserts what in a relational database might be one.
 30K inserts is certainly not cheap.

 Your use case with 30,000 inserts is probably a special case. Most
 directory services that I am aware of OpenLDAP, Active Directory, Sun
 Directory server do eventually consistent master/slave and multi-master
 replication. So no worries about having to background something. You just
 want the replication to be fast enough so that when you call the employee
 about to be fired into the office, that by the time he leaves and gets home
 he can not VPN to rm -rf / your main file server :)


 On Sun, Jan 27, 2013 at 7:57 PM, Hiller, Dean dean.hil...@nrel.gov wrote:
 Sometimes this is true, sometimes not. We have a use case with an admin
 tool where we chose to do this denorm for ACLs, to make permission checks
 extremely fast. That said, we have one issue with one object that has too
 many children (30,000), so when someone gives a user access to this one
 object with 30,000 children, we end up with a bad 60-second wait, and users
 ended up getting frustrated and trying to cancel (our plan, since admin
 activity hardly ever happens, is to do it on a background thread, return
 immediately to the user, and tell him his changes will take effect in 1
 minute). After all, admin changes are infrequent anyway. This example
 demonstrates how sometimes it could almost burn you.

 I guess my real point is it really depends on your use cases ;).  In a lot
 of cases denorm can work but in some cases it burns you so you have to
 balance it all.  In 90% of our cases our denorm is working great and for
 this one case, we need to background the permission change as we still LOVE
 the performance of our ACL checks.

 P.S. 30,000 writes in Cassandra is not cheap when done from one server ;)
 but in general parallelized writes are very fast for something like 500.

 Later,
 Dean

 From: Edward Capriolo edlinuxg...@gmail.com
 Reply-To: user@cassandra.apache.org
 Date: Sunday, January 27, 2013 5:50 PM
 To: user@cassandra.apache.org
 Subject: Re: Denormalization
 Subject: Re: Denormalization

 One technique is, on the client side, to build a tool that takes the event
 and produces N mutations. In C* writes are cheap, so essentially you
 re-write everything on all changes.

 On Sun, Jan 27, 2013 at 4:03 PM, Fredrik Stigbäck 
 fredrik.l.stigb...@sitevision.se wrote:
 Hi.
 Since denormalized data is a first-class citizen in Cassandra, how does one
 handle updating denormalized data?
 E.g., if we have a USER CF with name, email, etc., denormalize the user
 data into many other CFs, and then update the information about a user
 (name, email, ...), what is the best way to update those user data
 properties, which might be spread out over many CFs and many rows?

 Regards
 /Fredrik
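Edward's client-side fan-out (one event expanded into N mutations, rewriting every denormalized copy) can be sketched as follows. This is an illustrative sketch; the list of column families and the mutation shape are made up:

```python
# One user-update event is expanded into a mutation per column family that
# holds a denormalized copy of the user's data (CF names are hypothetical).
DENORMALIZED_CFS = ["users", "user_by_email", "posts_by_user"]

def mutations_for(event):
    """Expand a single update event into one rewrite per denormalized CF."""
    return [{"cf": cf, "key": event["user_id"], "columns": event["changes"]}
            for cf in DENORMALIZED_CFS]

muts = mutations_for({"user_id": "u1", "changes": {"email": "new@x.com"}})
```

As Dean's example shows, the fan-out cost grows with the number of copies, which is why a very wide fan-out (30,000 children) may need to move to a background thread.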





Re: BulkOutputFormat

2013-01-17 Thread chandra Varahala
I am not using reducers, just a map-only job; still the same kind of issue?

thanks
chandra


On Thu, Jan 17, 2013 at 1:50 PM, Michael Kjellman
mkjell...@barracuda.com wrote:

 https://issues.apache.org/jira/browse/CASSANDRA-4813

 Fixed in 1.2.0

 Best,
 michael

 From: chandra Varahala hadoopandcassan...@gmail.com
 Reply-To: user@cassandra.apache.org user@cassandra.apache.org
 Date: Thursday, January 17, 2013 10:39 AM
 To: user@cassandra.apache.org user@cassandra.apache.org
 Subject: BulkOutputFormat

 Hello,

 I am facing issues with BulkOutputFormat loading data from Hadoop into
 Cassandra.

 Cluster details :

 In QA we have 15 nodes in the Hadoop cluster and 2 Cassandra nodes; in
 production we have 150 Hadoop nodes and 10 Cassandra nodes.

 There are two Cassandra clusters, random-order and byte-order, on the same
 machines with different ports.

 Issue 1:

 I can load a small amount (1 GB) of data into the random-order cluster,
 but more than 1 GB throws the error below:
 java.io.IOException: Too many hosts failed: [/172.20.128.48, /
 172.20.128.49]
 at
 org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:243)
 at
 org.apache.cassandra.hadoop.BulkRecordWriter.close(BulkRecordWriter.java:208)
 at
 org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:540)
 at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:649)
 at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
 at org.apache.hadoop.mapred.Child$4.run(Child.java:270)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:396)
 at
 org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
 at org.apache.hadoop.mapred.Child.main(Child.java:264)

 attempt_201301100944_4262_m_00_0: Exception in thread Streaming to /
 172.20.138.48:1 java.lang.RuntimeException: java.io.EOFException


 Issue 2:

 I am not able to load data into the byte-order cluster; I get the same
 error as above.


 Please help.

 Chandra