Re: Cassandra 2.1.2, Pig 0.14, Hadoop 2.6.0 does not work together

2015-01-23 Thread Pinak Pani
Thanks, Dave. I found that Pig 0.14 and Hadoop 2.6.0 still use Guava 11.x,
which was causing the issue. Replacing all of those copies with Guava 17
did not end the ordeal. It seems that Guava made some breaking changes in
v17 (https://issues.apache.org/jira/browse/HADOOP-11032). You need
version 16.0, to be precise, to get things working.
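
A quick way to spot competing Guava copies is to scan a classpath string for Guava jars. The following is a minimal sketch, not part of the original exchange: the class name GuavaJarFinder and the "guava-*.jar" file-name convention are assumptions, and it only inspects the classpath string it is given, so jars that Pig or Hadoop add from their own lib directories (or Guava classes bundled inside another jar) will not show up.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists classpath entries whose file name looks like a Guava jar.
final class GuavaJarFinder {

    /** Returns the entries of the given classpath whose file name is guava-*.jar. */
    static List<String> guavaJars(String classpath) {
        List<String> hits = new ArrayList<>();
        // File.pathSeparator is ":" on Unix-like systems and ";" on Windows.
        for (String entry : classpath.split(File.pathSeparator)) {
            String fileName = new File(entry).getName();
            if (fileName.startsWith("guava-") && fileName.endsWith(".jar")) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumption: PIG_CLASSPATH is set as in the thread; otherwise use the JVM classpath.
        String classpath = System.getenv("PIG_CLASSPATH");
        if (classpath == null) {
            classpath = System.getProperty("java.class.path");
        }
        for (String jar : guavaJars(classpath)) {
            System.out.println(jar);
        }
    }
}
```

If more than one entry is printed, or the single entry is not the 16.0 jar, a version conflict of the kind described above is a likely suspect.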

The MR job was still failing with NoClassDefFoundError because it could not
find com.codahale.metrics.Metric on the classpath, so I had to download and
include it. The example finally ran.

However, I still *cannot* use CqlStorage(); I had to rely on
CqlNativeStorage().

Anyway, thanks again.

On Fri, Jan 23, 2015 at 6:28 AM, Dave Brosius dbros...@mebigfatguy.com
wrote:

  The method

 com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;

 should be available in guava from 15.0 on. So guava-16.0 should be fine.

 Is it possible that Guava is being picked up from somewhere else? Do you
 have a global classpath variable?

 You might want to run

 URL u = YourClass.class.getResource("/com/google/common/collect/Sets.class");
 System.out.println(u);

 to see where you are loading Guava from.
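
 The lookup suggested above can be extended to list every copy of a class that is visible on the classpath, which helps when two Guava versions compete. This is a minimal sketch under stated assumptions: the class name WhereIsClass is hypothetical, it uses the system class loader, and it has to run with the same classpath as the failing job to be meaningful. With standard parent-first delegation the first URL printed is normally the copy that gets loaded.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic: prints every location from which a class could be loaded.
final class WhereIsClass {

    /** Converts a class name such as a.b.C into the resource path a/b/C.class. */
    static String toResourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns all URLs on the system classpath that provide the given class. */
    static List<URL> locateAll(String className) {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        try {
            return Collections.list(loader.getResources(toResourcePath(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "com.google.common.collect.Sets";
        List<URL> locations = locateAll(name);
        if (locations.isEmpty()) {
            System.out.println(name + " is not on the classpath");
        }
        for (URL location : locations) {
            System.out.println(location);
        }
    }
}
```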



Cassandra 2.1.2, Pig 0.14, Hadoop 2.6.0 does not work together

2015-01-22 Thread Pinak Pani
I am using Pig with Cassandra (Cassandra 2.1.2, Pig 0.14, Hadoop 2.6.0
combo).

When I use CqlStorage() I get

org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
org.apache.cassandra.exceptions.ConfigurationException: Unable to find
inputformat class 'org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat/

When I use CqlNativeStorage() I get

java.lang.NoSuchMethodError:
com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;

Pig classpath looks like this:

» echo $PIG_CLASSPATH

/home/naishe/apps/apache-cassandra-2.1.2/lib/airline-0.6.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/antlr-runtime-3.5.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/apache-cassandra-2.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/apache-cassandra-clientutil-2.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/apache-cassandra-thrift-2.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-cli-1.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-codec-1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-lang3-3.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/commons-math3-3.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/compress-lzf-0.8.4.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/concurrentlinkedhashmap-lru-1.4.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/disruptor-3.0.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/
*guava-16.0.jar*
:/home/naishe/apps/apache-cassandra-2.1.2/lib/high-scale-lib-1.0.6.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jackson-core-asl-1.9.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jackson-mapper-asl-1.9.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jamm-0.2.8.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/javax.inject.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jbcrypt-0.3m.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jline-1.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/jna-4.0.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/json-simple-1.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/libthrift-0.9.1.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/logback-classic-1.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/logback-core-1.1.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/lz4-1.2.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/metrics-core-2.2.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/netty-all-4.0.23.Final.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/reporter-config-2.1.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/slf4j-api-1.7.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/snakeyaml-1.11.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/snappy-java-1.0.5.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/stream-2.5.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/stringtemplate-4.0.2.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/super-csv-2.1.0.jar:/home/naishe/apps/apache-cassandra-2.1.2/lib/thrift-server-0.3.7.jar::/home/naishe/.m2/repository/com/datastax/cassandra/cassandra-driver-core/2.1.2/cassandra-driver-core-2.1.2.jar:/home/naishe/.m2/repository/org/apache/cassandra/cassandra-all/2.1.2/cassandra-all-2.1.2.jar

I read that this is due to a version conflict with the Guava library (
http://stackoverflow.com/questions/27089126/nosuchmethoderror-sets-newconcurrenthashset-while-running-jar-using-hadoop#comment42687234_27089126
). So I tried using Guava 11.0.2, but that did not help.

Here is the Pig Latin that I was trying to execute:

grunt> alice = LOAD 'cql://hadoop_test/lines' USING CqlNativeStorage();
2015-01-22 09:28:54,133 [main] INFO  org.apache.hadoop.conf.Configuration.deprecation - fs.default.name is deprecated. Instead, use fs.defaultFS
grunt> B = foreach alice generate flatten(TOKENIZE((chararray)$0)) as word;
grunt> C = group B by word;
grunt> D = foreach C generate COUNT(B) as word_count, group as word;
grunt> dump D;
2015-01-22 09:29:06,808 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: GROUP_BY
[ -- snip -- ]
2015-01-22 09:29:11,254 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-01-22 09:29:11,588 [LocalJobRunner Map Task Executor #0] INFO  org.apache.hadoop.mapred.MapTask - Starting flush of map output
2015-01-22 09:29:11,600 [Thread-22] INFO  org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2015-01-22 09:29:11,620 [Thread-22] WARN  org.apache.hadoop.mapred.LocalJobRunner - job_local1857630817_0001
java.lang.Exception: java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoSuchMethodError: com.google.common.collect.Sets.newConcurrentHashSet()Ljava/util/Set;
    at

A trigger that modifies the current Mutation

2014-09-28 Thread Pinak Pani
Hi,

I want to create a trigger that alters the current mutation. For example,
I want to iterate through the ColumnFamily in the augment method, find all
the fields that are of type text or varchar, and change them to upper case.
I am not sure how to do that. Can someone help me?

Basically, this is what I want to do:

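import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Collection;

import org.apache.cassandra.db.Cell;
import org.apache.cassandra.db.ColumnFamily;
import org.apache.cassandra.db.Mutation;
import org.apache.cassandra.triggers.ITrigger;

public class AllCapsTrigger implements ITrigger {

  public Collection<Mutation> augment(ByteBuffer key, ColumnFamily cf) {
    for (Cell cell : cf) {
      if (cell.value().hasRemaining()) {
        System.out.println("Value: " + new String(cell.value().array(),
            StandardCharsets.UTF_8));

        /**
         * Check if cell is of type text/varchar
         * Set cell value to upper case of what it has
         **/
      }
    }

    // Placeholder: no additional mutations are returned yet.
    return null;
  }
}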

Thanks for reading this.

Regards,
Pinak
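
The value-transformation step described in the comment above can be sketched in plain Java, independent of Cassandra. This is only an illustration under stated assumptions: the class name Utf8UpperCase is hypothetical, the value is assumed to hold UTF-8 text, and the sketch does not cover checking the column type or writing the new value back into the mutation, both of which depend on Cassandra's internal API. It decodes a duplicate of the buffer rather than calling array(), because array() returns the whole backing array regardless of the buffer's position and limit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: upper-cases the UTF-8 text held in a ByteBuffer.
final class Utf8UpperCase {

    /** Returns a new buffer with the upper-cased text; the input buffer is not modified. */
    static ByteBuffer toUpperCase(ByteBuffer value) {
        // duplicate() shares the bytes but has its own position and limit,
        // so decoding does not consume the caller's buffer.
        String text = StandardCharsets.UTF_8.decode(value.duplicate()).toString();
        return ByteBuffer.wrap(text.toUpperCase(Locale.ROOT).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        ByteBuffer in = ByteBuffer.wrap("hello, cassandra".getBytes(StandardCharsets.UTF_8));
        ByteBuffer out = toUpperCase(in);
        System.out.println(StandardCharsets.UTF_8.decode(out));
    }
}
```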


Authentication is failing.

2014-09-28 Thread Pinak Pani
Hi,

I have been toying around with CQL. I realized that when I GRANT SELECT, I
lose authentication. Here is the process. Can someone point out what is wrong?

➜  apache-cassandra-2.1.0  bin/cqlsh -u cassandra -p cassandra

Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 2.1.0 | CQL spec 3.2.0 | Native protocol v3]
Use HELP for help.

cqlsh> CREATE USER testuser WITH PASSWORD 'abc';
cqlsh> GRANT SELECT ON demo_cql.grant_test TO testuser;
cqlsh> exit
➜  apache-cassandra-2.1.0  bin/cqlsh -u testuser -p abc -k demo_cql -e 'select * from grant_test'
Connection error: ('Unable to connect to any servers', {'127.0.0.1': Unauthorized(u'code=2100 [Unauthorized] message=User testuser has no SELECT permission on table system.schema_triggers or any of its parents',)})

Thanks,
Pinak


Re: Authentication is failing.

2014-09-28 Thread Pinak Pani
Hi Jens,

 Just making sure, have you set authenticator and authorizer in
cassandra.yaml?

Yes.

Hi Philip,

I guess you are right. I will test that tomorrow morning, and confirm.

Thanks for the help guys.

On Mon, Sep 29, 2014 at 2:10 AM, Philip Thompson 
philip.thomp...@datastax.com wrote:

 You are running into https://issues.apache.org/jira/browse/CASSANDRA-7967.
 This is fixed in 2.0.11 and 2.1.1. Until then, I believe you will need to
 explicitly grant SELECT permission on system.schema_triggers to the user
 as a workaround.

 On Sun, Sep 28, 2014 at 12:53 PM, Jens Rantil jens.ran...@tink.se wrote:

 Hi Pinak,

 Just making sure, have you set authenticator and authorizer in
 cassandra.yaml?

 Cheers,
 Jens





How to create counter column family via Pycassa?

2013-08-15 Thread Pinak Pani
I cannot find a way to create a counter column family in Pycassa.
This[1] does not help.

I would appreciate it if someone could help me.

Thanks

1.
http://pycassa.github.io/pycassa/api/pycassa/system_manager.html#pycassa.system_manager.SystemManager.create_column_family


Re: How to create counter column family via Pycassa?

2013-08-15 Thread Pinak Pani
Thanks for the quick reply. Apparently, I was trying to get this working:

cf_kwargs = {'default_validation_class': COUNTER_COLUMN_TYPE}
sys.create_column_family('my_ks', 'vote_count', column_validation_classes=cf_kwargs)  #1

But this works:

sys.create_column_family('my_ks', 'vote_count', **cf_kwargs)  #2

I thought #1 should work.



On Thu, Aug 15, 2013 at 9:15 PM, Tyler Hobbs ty...@datastax.com wrote:

 The only thing that makes a CF a counter CF is that the default validation
 class is CounterColumnType, which you can set through
 SystemManager.create_column_family().




 --
 Tyler Hobbs
 DataStax http://datastax.com/



Re: In a multiple data center setup, do all the data centers have complete data irrespective of RF?

2013-05-20 Thread Pinak Pani
Assume NetworkTopologyStrategy. I wanted to know whether each data center
will contain all the keys.

This is the case:

CREATE KEYSPACE appKS
  WITH placement_strategy = 'NetworkTopologyStrategy'
  AND strategy_options={DC1:3, DC2:3};

Do DC1 and DC2 each contain the complete database corpus? That is, if DC1
blows up, will I get all the data from DC2? Assume RF = 1.

Sorry for the very elementary question. This is the post that made me ask
this question:
http://www.onsip.com/blog/2011/07/15/intro-to-cassandra-and-networktopologystrategy

It says,

NTS creates an iterator for EACH datacenter and places writes discretely
for each. The result is that NTS basically breaks each datacenter into it's
own logical ring when it places writes.

That seems to mean that each data center behaves as an independent ring
with initial_token. So, if I have 2 data centers and NTS, I am basically
mirroring the database. Right?

Thanks,
PP


Re: In a multiple data center setup, do all the data centers have complete data irrespective of RF?

2013-05-20 Thread Pinak Pani
Great! Thanks, Bryan. I seem to have been confused by the logical ring
within each data center. Actually, if I focus on the RF that was configured,
it is all obvious. :)

Thanks for your time,
PP
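
The point about the configured RF can be illustrated with a small simulation. This is a simplified sketch, not Cassandra's actual implementation: the class NtsSketch, the six-node ring, and its token values are all hypothetical, and rack awareness is ignored. It walks the ring clockwise from a key's token and keeps nodes until every data center has as many replicas as its own replication factor, so with RF >= 1 in each data center, every key ends up stored in every data center.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of per-data-center replica placement (racks ignored).
final class NtsSketch {

    /** A ring member: an illustrative node name and the data center it belongs to. */
    static final class Node {
        final String name;
        final String dc;
        Node(String name, String dc) { this.name = name; this.dc = dc; }
    }

    /** Walks the ring clockwise from the key's token until each DC has its RF replicas. */
    static List<Node> replicasFor(long keyToken, TreeMap<Long, Node> ring, Map<String, Integer> rfPerDc) {
        List<Node> clockwise = new ArrayList<>(ring.tailMap(keyToken, true).values());
        clockwise.addAll(ring.headMap(keyToken, false).values()); // wrap around the ring

        Map<String, Integer> placed = new HashMap<>();
        List<Node> replicas = new ArrayList<>();
        for (Node node : clockwise) {
            int wanted = rfPerDc.getOrDefault(node.dc, 0);
            int have = placed.getOrDefault(node.dc, 0);
            if (have < wanted) {
                replicas.add(node);
                placed.put(node.dc, have + 1);
            }
        }
        return replicas;
    }

    /** Six nodes with tokens 0, 10, ..., 50, alternating between DC1 and DC2. */
    static TreeMap<Long, Node> exampleRing() {
        TreeMap<Long, Node> ring = new TreeMap<>();
        for (int i = 0; i < 6; i++) {
            ring.put(10L * i, new Node("node" + i, (i % 2 == 0) ? "DC1" : "DC2"));
        }
        return ring;
    }

    /** Builds a per-DC replication factor map for the two example data centers. */
    static Map<String, Integer> rf(int dc1, int dc2) {
        Map<String, Integer> rf = new HashMap<>();
        rf.put("DC1", dc1);
        rf.put("DC2", dc2);
        return rf;
    }

    static long countInDc(List<Node> replicas, String dc) {
        return replicas.stream().filter(n -> n.dc.equals(dc)).count();
    }

    public static void main(String[] args) {
        TreeMap<Long, Node> ring = exampleRing();
        for (long token = 0; token < 60; token += 15) {
            List<Node> replicas = replicasFor(token, ring, rf(1, 1));
            System.out.println("token " + token + ": DC1=" + countInDc(replicas, "DC1")
                    + " DC2=" + countInDc(replicas, "DC2"));
        }
    }
}
```

In this model, a key has one replica in DC1 and one in DC2 when each data center's RF is 1, so losing one data center still leaves a full copy in the other.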