Hi All,

It looks like this is a known issue with Cassandra-0.8.4. So either I have to
wait until 0.8.5 is released or switch to 0.7.8, if the issue has been resolved
there.
Ref: https://issues.apache.org/jira/browse/CASSANDRA-3044
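
For anyone hitting the same trace before an upgrade, here is a minimal sketch
of a client-side guard. It assumes, based only on the exception text
"UnknownHostException: /199.168.0.2" in the traces quoted below, that the
endpoint string reaches InetAddress.getByName with a leading slash.
EndpointNames, stripHostPrefix and resolve are hypothetical names, not part of
Cassandra or Hadoop, and this is not the patch attached to the JIRA ticket.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration only.
final class EndpointNames {

    /** Keeps only the part after the last '/', e.g. "lab02/199.168.0.2" or "/199.168.0.2". */
    static String stripHostPrefix(String endpoint) {
        int slash = endpoint.lastIndexOf('/');
        return slash >= 0 ? endpoint.substring(slash + 1) : endpoint;
    }

    /** Resolves an endpoint string after removing any "host/" or "/" prefix. */
    static InetAddress resolve(String endpoint) throws UnknownHostException {
        // For an IP literal, getByName only parses the text; no name lookup is made.
        return InetAddress.getByName(stripHostPrefix(endpoint));
    }

    public static void main(String[] args) throws UnknownHostException {
        System.out.println(resolve("/199.168.0.2").getHostAddress()); // prints 199.168.0.2
    }
}
```

Since the failing call sits inside RingCache in the traces, a guard like this
would only help in code you control; it is meant to show why the slash matters.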

Regards,

  Thamizhannal P

--- On Thu, 25/8/11, Thamizh <tceg...@yahoo.co.in> wrote:

From: Thamizh <tceg...@yahoo.co.in>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Thursday, 25 August, 2011, 9:01 PM

Hi Aaron,

Thanks a lot for your suggestions. I am worn out by the error below. It would
be great if you could point out what went wrong with my approach.

I wanted to install cassandra-0.8.4 on 3 nodes and run a Map/Reduce job that
uploads data from HDFS to Cassandra.

I have installed Cassandra on 3 nodes, lab02(199.168.0.2), lab03(199.168.0.3) &
lab04(199.168.0.4) respectively. I can create a keyspace & column family, and
they get distributed across the cluster.

When I run my map/reduce program it ends up with "UnknownHostException". The
same map/reduce program works well on a single-node cluster.


Here are the steps which I have followed.

1. cassandra.yaml details

lab02(199.168.0.2): (seed node)

auto_bootstrap: false
seeds: "199.168.0.2"
listen_address: 199.168.0.2
rpc_address: 199.168.0.2

lab03(199.168.0.3):
auto_bootstrap: true
seeds: "199.168.0.2"
listen_address: 199.168.0.3
rpc_address: 199.168.0.3

lab04(199.168.0.4):
auto_bootstrap: true
seeds: "199.168.0.2"
listen_address: 199.168.0.4
rpc_address: 199.168.0.4


2. O/P of bin/cassandra:
    ------
    ------
 INFO 11:59:40,602 Node /199.168.0.2 is now part of the cluster
 INFO 11:59:40,604 InetAddress /199.168.0.2 is now UP
 INFO 11:59:55,667 Node /199.168.0.4 is now part of the cluster
 INFO 11:59:55,669 InetAddress /199.168.0.4 is now UP
 INFO 12:01:08,389 Joining: getting bootstrap token
 INFO 12:01:08,410 New token will be 43083119672609054510947312506340649252 to assume load from /199.168.0.2
 INFO 12:01:08,412 Enqueuing flush of Memtable-LocationInfo@6824966(123/153 serialized/live bytes, 4 ops)
 INFO 12:01:08,413 Writing Memtable-LocationInfo@6824966(123/153 serialized/live bytes, 4 ops)
 INFO 12:01:08,461 Completed flushing /var/lib/cassandra/data/system/LocationInfo-g-2-Data.db (287 bytes)
 INFO 12:01:08,477 Node /199.168.0.3 state jump to normal
 INFO 12:01:08,480 Enqueuing flush of Memtable-LocationInfo@10141941(53/66 serialized/live bytes, 2 ops)
 INFO 12:01:08,482 Writing Memtable-LocationInfo@10141941(53/66 serialized/live bytes, 2 ops)
 INFO 12:01:08,514 Completed flushing /var/lib/cassandra/data/system/LocationInfo-g-3-Data.db (163 bytes)
 INFO 12:01:08,527 Node /199.168.0.3 state jump to normal
 INFO 12:01:08,652 mx4j successfuly loaded
HttpAdaptor version 3.0.1 started on port 8081

3. When I run my map/reduce program it ends up with "UnknownHostException":

Error: java.net.UnknownHostException: /199.168.0.2
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
    at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1200)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
    at java.net.InetAddress.getAllByName(InetAddress.java:1083)
    at java.net.InetAddress.getAllByName(InetAddress.java:1019)
    at java.net.InetAddress.getByName(InetAddress.java:969)
    at org.apache.cassandra.client.RingCache.refreshEndpointMap(RingCache.java:93)
    at org.apache.cassandra.client.RingCache.<init>(RingCache.java:67)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:98)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:92)
    at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:132)
    at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:62)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Here are the config lines for the map/reduce job.

        job4.setReducerClass(TblUploadReducer.class);
        job4.setOutputKeyClass(ByteBuffer.class);
        job4.setOutputValueClass(List.class);
        job4.setOutputFormatClass(ColumnFamilyOutputFormat.class);
        ConfigHelper.setOutputColumnFamily(job4.getConfiguration(), args[1], args[3]);
        ConfigHelper.setRpcPort(job4.getConfiguration(), args[7]); // 9160
        ConfigHelper.setInitialAddress(job4.getConfiguration(), args[9]); // 199.168.0.2
        ConfigHelper.setPartitioner(job4.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");

Steps which I have verified:
1. Passwordless ssh has been configured between lab02, lab03 & lab04. All the
nodes can ping each other without any issues.
2. When I ran "InetAddress.getLocalHost()" from a java program on lab02 it
printed "lab02/199.168.0.2".
3. When I looked over the output of bin/cassandra, it prints a couple of
messages with "/199.168.0.3" etc. in the InetAddress field. Here it does not
print "hostname/IP". Is that a problem?
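
On the "hostname/IP" question: as far as I can tell, the bare "/199.168.0.3"
form is simply how java.net.InetAddress.toString() prints an address object
that carries no host name, so the log lines by themselves do not point to a
misconfiguration. A small sketch follows; the class name InetAddressFormat and
the format helper are made up for illustration, while the host name lab03 and
the address come from the setup above.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Illustrative only: shows the two shapes InetAddress.toString() can produce.
final class InetAddressFormat {

    /** Builds an InetAddress from raw bytes (no name lookup) and returns its toString(). */
    static String format(String hostName, byte[] ip) {
        try {
            InetAddress addr = (hostName == null)
                    ? InetAddress.getByAddress(ip)            // no host name attached
                    : InetAddress.getByAddress(hostName, ip); // host name supplied by caller
            return addr.toString();
        } catch (UnknownHostException e) {
            // getByAddress only throws when the byte array has an illegal length.
            throw new IllegalArgumentException("address must be 4 or 16 bytes", e);
        }
    }

    public static void main(String[] args) {
        byte[] ip = {(byte) 199, (byte) 168, 0, 3};
        System.out.println(format(null, ip));    // prints /199.168.0.3
        System.out.println(format("lab03", ip)); // prints lab03/199.168.0.3
    }
}
```

What does matter is that a string in the "/199.168.0.3" form cannot be handed
back to InetAddress.getByName as it is, which matches the exception text.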

Kindly help me.

Regards,
Thamizhannal 

--- On Thu, 25/8/11, aaron morton <aa...@thelastpickle.com> wrote:

From: aaron morton <aa...@thelastpickle.com>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Thursday, 25 August, 2011, 3:45 AM

Jump on the machine that raised the error and see if you can ssh to node01,
or try using the IP addresses to see if they work.
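
A rough way to run the same kind of check from Java on each Hadoop node, in
case ssh works but name resolution inside the JVM does not. This is only a
sketch: ResolveCheck and check are hypothetical names, and the default host
list simply reuses node01, node02 and node03 from this thread.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical diagnostic: reports what each name resolves to on this machine.
final class ResolveCheck {

    /** Maps every given host name or IP literal to its resolved address, or "UNRESOLVED". */
    static Map<String, String> check(String... hosts) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String host : hosts) {
            try {
                result.put(host, InetAddress.getByName(host).getHostAddress());
            } catch (UnknownHostException e) {
                result.put(host, "UNRESOLVED");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Pass host names on the command line, or fall back to the names used in this thread.
        String[] hosts = args.length > 0 ? args : new String[] {"node01", "node02", "node03"};
        check(hosts).forEach((host, addr) -> System.out.println(host + " -> " + addr));
    }
}
```

Running it on every task node with both the host names and the raw IP
addresses should show quickly whether one machine resolves a name differently.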
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com



On 24/08/2011, at 11:34 PM, Thamizh wrote:
Hi Aaron,

This is yet to be resolved. 

I have set up a Cassandra multi-node cluster and am facing issues in pushing
HDFS data to Cassandra. When I run the MapReduce program I get an
UnknownHostException.

In hadoop(0.20.1), I have configured node01 as master and node01, node02 &
node03 as slaves.

In Cassandra(0.8.4), the installation & configuration have been done. When I
issue the nodetool ring command I can see the ring, and the keyspaces & column
families have been distributed.

o/p: nodetool
$bin/nodetool -h node02 ring
Address         DC          Rack        Status State   Load            Owns    Token
                                                                               161930152162677484001961360738128229499
198.168.0.1     datacenter1 rack1       Up     Normal  132.28 MB       12.48%  13027320554261208311902766005835168982
198.168.0.2     datacenter1 rack1       Up     Normal  99.34 MB        75.07%  140745249930211229277235689500208693608
198.168.0.3     datacenter1 rack1       Up     Normal  66.21 KB        12.45%  161930152162677484001961360738128229499
nutch@lab02:/code/apache-cassandra-0.8.4$ 
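
As a side note on the Owns column: with RandomPartitioner the token space runs
from 0 to 2^127, and each node owns the range from the previous node's token
(exclusive) up to its own token (inclusive), wrapping around at the top. The
sketch below recomputes the three percentages from the tokens printed above.
RingOwnership and ownedPercent are made-up names for illustration, not a
Cassandra API.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Worked arithmetic for the "Owns" column, assuming RandomPartitioner (ring size 2^127).
final class RingOwnership {

    static final BigInteger RING = BigInteger.ONE.shiftLeft(127);

    /** Share of the ring, in percent, between prevToken (exclusive) and token (inclusive). */
    static String ownedPercent(String prevToken, String token) {
        // mod() is never negative, so it also covers the wrap-around range of the first node.
        BigInteger width = new BigInteger(token).subtract(new BigInteger(prevToken)).mod(RING);
        return new BigDecimal(width)
                .multiply(BigDecimal.valueOf(100))
                .divide(new BigDecimal(RING), 2, RoundingMode.HALF_UP)
                .toPlainString();
    }

    public static void main(String[] args) {
        String t1 = "13027320554261208311902766005835168982";  // 198.168.0.1
        String t2 = "140745249930211229277235689500208693608"; // 198.168.0.2
        String t3 = "161930152162677484001961360738128229499"; // 198.168.0.3
        System.out.println(ownedPercent(t3, t1)); // prints 12.48
        System.out.println(ownedPercent(t1, t2)); // prints 75.07
        System.out.println(ownedPercent(t2, t3)); // prints 12.45
    }
}
```

The uneven split comes from the tokens themselves and is separate from the
UnknownHostException.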


Here is the Hadoop config.

        job4.setOutputFormatClass(ColumnFamilyOutputFormat.class);
        ConfigHelper.setOutputColumnFamily(job4.getConfiguration(), KEYSPACE, COLUMN_FAMILY);
        ConfigHelper.setRpcPort(job4.getConfiguration(), "9160");
        ConfigHelper.setInitialAddress(job4.getConfiguration(), "node01");
        ConfigHelper.setPartitioner(job4.getConfiguration(), "org.apache.cassandra.dht.RandomPartitioner");

Below is the exception message:

Error: java.net.UnknownHostException: /198.168.0.3
    at java.net.Inet6AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$1.lookupAllHostAddr(InetAddress.java:849)
    at java.net.InetAddress.getAddressFromNameService(InetAddress.java:1200)
    at java.net.InetAddress.getAllByName0(InetAddress.java:1153)
    at java.net.InetAddress.getAllByName(InetAddress.java:1083)
    at java.net.InetAddress.getAllByName(InetAddress.java:1019)
    at java.net.InetAddress.getByName(InetAddress.java:969)
    at org.apache.cassandra.client.RingCache.refreshEndpointMap(RingCache.java:93)
    at org.apache.cassandra.client.RingCache.<init>(RingCache.java:67)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:98)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.<init>(ColumnFamilyRecordWriter.java:92)
    at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:132)
    at org.apache.cassandra.hadoop.ColumnFamilyOutputFormat.getRecordWriter(ColumnFamilyOutputFormat.java:62)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:553)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)

Note: The same /etc/hosts file is used across all the nodes.

Kindly help me to resolve this issue.


Regards,

  Thamizhannal P

--- On Wed, 24/8/11, aaron morton <aa...@thelastpickle.com> wrote:

From: aaron morton <aa...@thelastpickle.com>
Subject: Re: multi-node cassandra config doubt
To: user@cassandra.apache.org
Date: Wednesday, 24 August, 2011, 2:40 PM

Did you get this sorted?
At a guess, I would say there are no nodes listed in the Hadoop JobConf.
Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com



On 23/08/2011, at 9:51 PM, Thamizh wrote:
Hi All,

This is a question about multi-node cluster configuration.

I have configured a 3-node cluster using Cassandra-0.8.4 and am getting an
error when I run a Map/Reduce job which uploads records from HDFS to Cassandra.

Here is the config file (cassandra.yaml) for each of my 3 Cassandra nodes:

node01:
    seeds: "node01,node02,node03"
    auto_bootstrap: false
    listen_address: 192.168.0.1
    rpc_address: 192.168.0.1


node02:
    seeds: "node01,node02,node03"
    auto_bootstrap: true
    listen_address: 192.168.0.2
    rpc_address: 192.168.0.2


node03:
    seeds: "node01,node02,node03"
    auto_bootstrap: true
    listen_address: 192.168.0.3
    rpc_address: 192.168.0.3

When I ran the M/R program, I got the error below:
11/08/23 04:37:00 INFO mapred.JobClient:  map 100% reduce 11%
11/08/23 04:37:06 INFO mapred.JobClient:  map 100% reduce 22%
11/08/23 04:37:09 INFO mapred.JobClient:  map 100% reduce 33%
11/08/23 04:37:14 INFO mapred.JobClient: Task Id : attempt_201104211044_0719_r_000000_0, Status : FAILED
java.lang.NullPointerException
    at org.apache.cassandra.client.RingCache.getRange(RingCache.java:130)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:125)
    at org.apache.cassandra.hadoop.ColumnFamilyRecordWriter.write(ColumnFamilyRecordWriter.java:60)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:90)
    at CassTblUploader$TblUploadReducer.reduce(CassTblUploader.java:1)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:563)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:408)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)


Is anything wrong in my cassandra.yaml file?

I followed http://wiki.apache.org/cassandra/MultinodeCluster for the cluster
configuration.

Regards,
Thamizhannal
