Re: how to add a shareable node label?

2016-10-04 Thread Sunil Govind
Hi Frank,

As far as I have checked, all labels are "exclusive" in 2.7. In the upcoming
2.8 release, we will get "non-exclusive" (shareable) node labels.
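
Once you are on 2.8, the same command you tried should be accepted as-is; the
documented syntax for non-exclusive labels is along these lines (a sketch from
the node labels documentation, so please verify against your release):

$ yarn rmadmin -addToClusterNodeLabels "label_1(exclusive=true),label_2(exclusive=false)"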

Thanks
Sunil

On Wed, Oct 5, 2016 at 8:40 AM Frank Luo  wrote:

> I am using Hadoop 2.7.3. When I run:
>
> $ yarn rmadmin -addToClusterNodeLabels "Label1(exclusive=false)"
>
>
>
> I get this error:
>
> … addToClusterNodeLabels: java.io.IOException: label name should only
> contains {0-9, a-z, A-Z, -, _} and should not started with {-,_}
>
>
>
> If I just use “Label1”, it works fine, but I want a shareable label.
>
>
>
> Does anyone know a better way to do it?
>


how to add a shareable node label?

2016-10-04 Thread Frank Luo
I am using Hadoop 2.7.3. When I run:
$ yarn rmadmin -addToClusterNodeLabels "Label1(exclusive=false)"

I get this error:

… addToClusterNodeLabels: java.io.IOException: label name should only contains 
{0-9, a-z, A-Z, -, _} and should not started with {-,_}

If I just use “Label1”, it works fine, but I want a shareable label.

Does anyone know a better way to do it?



Re: native snappy library not available: this version of libhadoop was built without snappy support.

2016-10-04 Thread Wei-Chiu Chuang
It seems to me this issue is the direct result of MAPREDUCE-6577.

Since you’re on a CDH cluster, I would suggest you move up to CDH 5.7.2 or
above, where this bug is fixed.
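
In the meantime, one workaround that may help (an untested suggestion on my
part, using the native directory shown in your checknative output) is to point
the Spark driver and executors at the native libraries explicitly:

spark-submit \
  --conf spark.driver.extraLibraryPath=/usr/lib/hadoop/lib/native \
  --conf spark.executor.extraLibraryPath=/usr/lib/hadoop/lib/native \
  ...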

Best,
Wei-Chiu Chuang

> On Oct 4, 2016, at 1:26 PM, Wei-Chiu Chuang  wrote:
> 
> I see. Sorry for the confusion.
> 
> It seems to me the warning message is a bit misleading. It may also be
> printed if libhadoop cannot be loaded for any reason.
> Can you turn on debug logging and see whether the log contains either "Loaded the
> native-hadoop library" or "Failed to load native-hadoop with error"?
> 
> 
> Wei-Chiu Chuang
> 
>> On Oct 4, 2016, at 1:12 PM, Uthayan Suthakar wrote:
>> 
>> Hi Wei-Chiu,
>> 
>> My Hadoop version is Hadoop 2.6.0-cdh5.7.0.
>> 
>> But when I check the native libraries, it shows that Snappy is installed:
>> 
>> hadoop checknative
>> 16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded & initialized 
>> native-bzip2 library system-native
>> 16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized 
>> native-zlib library
>> Native library checking:
>> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
>> zlib:true /lib64/libz.so.1
>> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
>> lz4: true revision:99
>> bzip2:   true /lib64/libbz2.so.1
>> openssl: true /usr/lib64/libcrypto.so
>> 
>> Thanks.
>> 
>> Uthay
>> 
>> 
>> On 4 October 2016 at 21:05, Wei-Chiu Chuang wrote:
>> Hi Uthayan,
>> what version of Hadoop do you have? The Hadoop 2.7.3 binary does not ship
>> with Snappy precompiled. If this is the version you have, you may need to
>> rebuild Hadoop yourself to include it.
>> 
>> Wei-Chiu Chuang
>> 
>>> On Oct 4, 2016, at 12:59 PM, Uthayan Suthakar wrote:
>>> 
>>> Hello guys,
>>> 
>>> I have a job that reads Snappy-compressed data, but when I run the job it
>>> throws the error "native snappy library not available: this version of
>>> libhadoop was built without snappy support".
>>> 
>>> I followed these instructions, but they did not resolve the issue:
>>> https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html
>>> 
>>> The checknative command shows that Snappy is installed.
>>> hadoop checknative
>>> 16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded & 
>>> initialized native-bzip2 library system-native
>>> 16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized 
>>> native-zlib library
>>> Native library checking:
>>> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
>>> zlib:true /lib64/libz.so.1
>>> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
>>> lz4: true revision:99
>>> bzip2:   true /lib64/libbz2.so.1
>>> openssl: true /usr/lib64/libcrypto.so
>>> 
>>> I also have code in the job that checks whether native Snappy is loaded,
>>> and it returns true.
>>> 
>>> Now I have no idea why I'm getting this error. Also, I had no issue
>>> reading Snappy data using a MapReduce job on the same cluster. Could anyone
>>> tell me what is wrong?
>>> 
>>> 
>>> 
>>> Thank you.
>>> 
>>> Stack:
>>> 
>>> 
>>> java.lang.RuntimeException: native snappy library not available: this 
>>> version of libhadoop was built without snappy support.
>>> at 
>>> org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
>>> at 
>>> org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
>>> at 
>>> org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
>>> at 
>>> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
>>> at 
>>> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>>> at 
>>> org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
>>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>> at 
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>> at 
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>>> at 
>>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>>> at org.apache.spark.rdd.RDD.

Re: native snappy library not available: this version of libhadoop was built without snappy support.

2016-10-04 Thread Wei-Chiu Chuang
I see. Sorry for the confusion.

It seems to me the warning message is a bit misleading. It may also be
printed if libhadoop cannot be loaded for any reason.
Can you turn on debug logging and see whether the log contains either "Loaded the
native-hadoop library" or "Failed to load native-hadoop with error"?
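
One way to check this from a shell (the NativeCodeLoader messages are logged
at DEBUG level during JVM startup):

export HADOOP_ROOT_LOGGER=DEBUG,console
hadoop checknative -a

For the Spark executors, the equivalent would be setting
log4j.logger.org.apache.hadoop.util.NativeCodeLoader=DEBUG in the executors'
log4j.properties.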


Wei-Chiu Chuang

> On Oct 4, 2016, at 1:12 PM, Uthayan Suthakar wrote:
> 
> Hi Wei-Chiu,
> 
> My Hadoop version is Hadoop 2.6.0-cdh5.7.0.
> 
> But when I check the native libraries, it shows that Snappy is installed:
> 
> hadoop checknative
> 16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded & initialized 
> native-bzip2 library system-native
> 16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized 
> native-zlib library
> Native library checking:
> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
> zlib:true /lib64/libz.so.1
> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
> lz4: true revision:99
> bzip2:   true /lib64/libbz2.so.1
> openssl: true /usr/lib64/libcrypto.so
> 
> Thanks.
> 
> Uthay
> 
> 
> On 4 October 2016 at 21:05, Wei-Chiu Chuang wrote:
> Hi Uthayan,
> what version of Hadoop do you have? The Hadoop 2.7.3 binary does not ship with
> Snappy precompiled. If this is the version you have, you may need to rebuild
> Hadoop yourself to include it.
> 
> Wei-Chiu Chuang
> 
>> On Oct 4, 2016, at 12:59 PM, Uthayan Suthakar wrote:
>> 
>> Hello guys,
>> 
>> I have a job that reads Snappy-compressed data, but when I run the job it
>> throws the error "native snappy library not available: this version of
>> libhadoop was built without snappy support".
>> 
>> I followed these instructions, but they did not resolve the issue:
>> https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html
>> 
>> The checknative command shows that Snappy is installed.
>> hadoop checknative
>> 16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded & initialized 
>> native-bzip2 library system-native
>> 16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized 
>> native-zlib library
>> Native library checking:
>> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
>> zlib:true /lib64/libz.so.1
>> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
>> lz4: true revision:99
>> bzip2:   true /lib64/libbz2.so.1
>> openssl: true /usr/lib64/libcrypto.so
>> 
>> I also have code in the job that checks whether native Snappy is loaded,
>> and it returns true.
>> 
>> Now I have no idea why I'm getting this error. Also, I had no issue reading
>> Snappy data using a MapReduce job on the same cluster. Could anyone tell me
>> what is wrong?
>> 
>> 
>> 
>> Thank you.
>> 
>> Stack:
>> 
>> 
>> java.lang.RuntimeException: native snappy library not available: this 
>> version of libhadoop was built without snappy support.
>> at 
>> org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
>> at 
>> org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
>> at 
>> org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
>> at 
>> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
>> at 
>> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
>> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
>> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at 
>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at 
>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at 
>> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
>> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
>> at 
>> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
>> at 
>> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>> at org.apache.spark.scheduler.Task.run(Task.scala:89)
>> at 
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
>> at 
>> java.util.concurr

Test for INFRA-12554 - please ignore (user)

2016-10-04 Thread sebb
Test for INFRA-12554 - please ignore




HDFS Issues.

2016-10-04 Thread Steve Brenneis
I have an HDFS cluster of three nodes, all running on Amazon EC2 instances. I
am using HDFS as an HBase backing store. Periodically, when I start the
cluster, the name node stays in safe mode because it says the number of live
datanodes has dropped to 0.

The number of live datanodes 2 has reached the minimum number 0. Safe mode will 
be turned off automatically once the thresholds have been reached.
The datanode logs appear to be normal, with no errors indicated. The dfsadmin 
report says the datanodes are both normal and that the name node is in contact 
with them.

Safe mode is ON
Configured Capacity: 16637566976 (15.49 GB)
Present Capacity: 7941234688 (7.40 GB)
DFS Remaining: 7940620288 (7.40 GB)
DFS Used: 614400 (600 KB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-
Live datanodes (2):

Name: 172.31.52.176:50010 (dev2)
Hostname: dev2
Decommission Status : Normal
Configured Capacity: 8318783488 (7.75 GB)
DFS Used: 307200 (300 KB)
Non DFS Used: 3257020416 (3.03 GB)
DFS Remaining: 5061455872 (4.71 GB)
DFS Used%: 0.00%
DFS Remaining%: 60.84%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 04 15:47:00 EDT 2016


Name: 172.31.63.188:50010 (dev1)
Hostname: dev1
Decommission Status : Normal
Configured Capacity: 8318783488 (7.75 GB)
DFS Used: 307200 (300 KB)
Non DFS Used: 5439311872 (5.07 GB)
DFS Remaining: 2879164416 (2.68 GB)
DFS Used%: 0.00%
DFS Remaining%: 34.61%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Tue Oct 04 15:47:00 EDT 2016
If I force the name node out of safe mode, the fsck command says that the file
system is corrupt. When this happens, the only thing I've been able to do to
get it back is to reformat the HDFS file system. I have not changed the
configuration of the cluster; this seems to occur at random. The system is in
development, but this will be unacceptable in production.
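
(For reference, the standard commands here are "hdfs dfsadmin -safemode leave"
to force the name node out of safe mode, and "hdfs fsck /" to check the file
system.)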

I’m using version 2.7.3. Thank you in advance for any help.



Re: native snappy library not available: this version of libhadoop was built without snappy support.

2016-10-04 Thread Uthayan Suthakar
Hi Wei-Chiu,

My Hadoop version is Hadoop 2.6.0-cdh5.7.0.

But when I check the native libraries, it shows that Snappy is installed:

hadoop checknative
16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded &
initialized native-bzip2 library system-native
16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
Native library checking:
hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib:true /lib64/libz.so.1
snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
lz4: true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so

Thanks.

Uthay


On 4 October 2016 at 21:05, Wei-Chiu Chuang  wrote:

> Hi Uthayan,
> what version of Hadoop do you have? The Hadoop 2.7.3 binary does not ship
> with Snappy precompiled. If this is the version you have, you may need to
> rebuild Hadoop yourself to include it.
>
> Wei-Chiu Chuang
>
> On Oct 4, 2016, at 12:59 PM, Uthayan Suthakar wrote:
>
> Hello guys,
>
> I have a job that reads Snappy-compressed data, but when I run the job
> it throws the error "native snappy library not available: this version
> of libhadoop was built without snappy support".
>
> I followed these instructions, but they did not resolve the issue:
> https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html
>
> The checknative command shows that Snappy is installed.
> hadoop checknative
> 16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded &
> initialized native-bzip2 library system-native
> 16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized
> native-zlib library
> Native library checking:
> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
> zlib:true /lib64/libz.so.1
> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
> lz4: true revision:99
> bzip2:   true /lib64/libbz2.so.1
> openssl: true /usr/lib64/libcrypto.so
>
> I also have code in the job that checks whether native Snappy is loaded,
> and it returns true.
>
> Now I have no idea why I'm getting this error. Also, I had no issue
> reading Snappy data using a MapReduce job on the same cluster. Could anyone
> tell me what is wrong?
>
>
>
> Thank you.
>
> Stack:
>
>
> java.lang.RuntimeException: native snappy library not available: this
> version of libhadoop was built without snappy support.
> at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
> at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
> at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
> at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
> at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
>
>
>


Re: native snappy library not available: this version of libhadoop was built without snappy support.

2016-10-04 Thread Wei-Chiu Chuang
Hi Uthayan,
what version of Hadoop do you have? The Hadoop 2.7.3 binary does not ship with
Snappy precompiled. If this is the version you have, you may need to rebuild
Hadoop yourself to include it.
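
For reference, a rebuild with the native Snappy bindings is along these lines
(see BUILDING.txt in the source tree for the prerequisites; the snappy
development headers must be installed on the build machine):

mvn package -Pdist,native -DskipTests -Dtar -Drequire.snappy

where -Drequire.snappy makes the build fail fast if libsnappy cannot be found.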

Wei-Chiu Chuang

> On Oct 4, 2016, at 12:59 PM, Uthayan Suthakar wrote:
> 
> Hello guys,
> 
> I have a job that reads Snappy-compressed data, but when I run the job it
> throws the error "native snappy library not available: this version of
> libhadoop was built without snappy support".
> 
> I followed these instructions, but they did not resolve the issue:
> https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html
> 
> The checknative command shows that Snappy is installed.
> hadoop checknative
> 16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded & initialized 
> native-bzip2 library system-native
> 16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized 
> native-zlib library
> Native library checking:
> hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
> zlib:true /lib64/libz.so.1
> snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
> lz4: true revision:99
> bzip2:   true /lib64/libbz2.so.1
> openssl: true /usr/lib64/libcrypto.so
> 
> I also have code in the job that checks whether native Snappy is loaded, and
> it returns true.
> 
> Now I have no idea why I'm getting this error. Also, I had no issue reading
> Snappy data using a MapReduce job on the same cluster. Could anyone tell me
> what is wrong?
> 
> 
> 
> Thank you.
> 
> Stack:
> 
> 
> java.lang.RuntimeException: native snappy library not available: this version 
> of libhadoop was built without snappy support.
> at 
> org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
> at 
> org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
> at 
> org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
> at 
> org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
> at 
> org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
> at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:89)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)



native snappy library not available: this version of libhadoop was built without snappy support.

2016-10-04 Thread Uthayan Suthakar
Hello guys,

I have a job that reads Snappy-compressed data, but when I run the job it
throws the error "native snappy library not available: this version
of libhadoop was built without snappy support".

I followed these instructions, but they did not resolve the issue:
https://community.hortonworks.com/questions/18903/this-version-of-libhadoop-was-built-without-snappy.html

The checknative command shows that Snappy is installed.
hadoop checknative
16/10/04 21:01:30 INFO bzip2.Bzip2Factory: Successfully loaded &
initialized native-bzip2 library system-native
16/10/04 21:01:30 INFO zlib.ZlibFactory: Successfully loaded & initialized
native-zlib library
Native library checking:
hadoop:  true /usr/lib/hadoop/lib/native/libhadoop.so.1.0.0
zlib:true /lib64/libz.so.1
snappy:  true /usr/lib/hadoop/lib/native/libsnappy.so.1
lz4: true revision:99
bzip2:   true /lib64/libbz2.so.1
openssl: true /usr/lib64/libcrypto.so

I also have code in the job that checks whether native Snappy is loaded,
and it returns true.
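
(For reference, the check is essentially the following, a minimal sketch using
Hadoop's NativeCodeLoader; note it only reports on the JVM it runs in, so the
driver may report true while the executors still fail:)

import org.apache.hadoop.util.NativeCodeLoader;

public class SnappyCheck {
    public static void main(String[] args) {
        // True if this JVM found and loaded libhadoop.so.
        boolean libhadoop = NativeCodeLoader.isNativeCodeLoaded();
        System.out.println("libhadoop loaded: " + libhadoop);
        // buildSupportsSnappy() is a native method inside libhadoop itself,
        // so it is only safe to call once libhadoop has been loaded.
        if (libhadoop) {
            System.out.println("built with snappy: " + NativeCodeLoader.buildSupportsSnappy());
        }
    }
}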

Now I have no idea why I'm getting this error. Also, I had no issue
reading Snappy data using a MapReduce job on the same cluster. Could anyone
tell me what is wrong?



Thank you.

Stack:


java.lang.RuntimeException: native snappy library not available: this
version of libhadoop was built without snappy support.
    at org.apache.hadoop.io.compress.SnappyCodec.checkNativeCodeLoaded(SnappyCodec.java:65)
    at org.apache.hadoop.io.compress.SnappyCodec.getDecompressorType(SnappyCodec.java:193)
    at org.apache.hadoop.io.compress.CodecPool.getDecompressor(CodecPool.java:178)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:111)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:237)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:208)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:101)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)


HDFS Replication Issue

2016-10-04 Thread Eric Swenson
I have set up a single node cluster (initially) and am attempting to
write a file from a client outside the cluster.  I’m using the Java
org.apache.hadoop.fs.FileSystem interface to write the file.
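
Roughly, the client side looks like the following (a minimal sketch; the
namenode URI and path are placeholders, not my actual values):

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteTest {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder address for the combined name/data node.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:8020"), conf);
        FSDataOutputStream out = fs.create(new Path("/tmp/test.txt"));
        out.write("hello".getBytes("UTF-8")); // the write itself returns
        out.close();                          // this is the call that hangs
        fs.close();
    }
}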

The write call returns, but the close call hangs for a very long time; it
eventually returns, but the resulting file in HDFS is 0 bytes in length. The
namenode log says:

2016-10-03 22:01:41,367 INFO BlockStateChange: chooseUnderReplicatedBlocks 
selected 1 blocks at priority level 0;  Total=1 Reset bookmarks? true
2016-10-03 22:01:41,367 INFO BlockStateChange: BLOCK* neededReplications = 1, 
pendingReplications = 0.
2016-10-03 22:01:41,367 INFO 
org.apache.hadoop.hdfs.server.blockmanagement.BlockManager: Blocks chosen but 
could not be replicated = 1; of which 1 have no target, 0 have no source, 0 are 
UC, 0 are abandoned, 0 already have enough replicas.

Why is the block not written to the single datanode (same as
namenode)? What does it mean to "have no target"? The replication
count is 1 and I would have thought that a single copy of the file
would be stored on the single cluster node.

I decided to see what happened if I added a second node to the cluster.  
Essentially the same thing happens.  The file (in HDFS) ends up being 
zero-length, and I get similar messages from the NameNode telling me that there 
are additional neededReplications and that none of the blocks could be 
replicated because they “have no target”.

If I SSH into the combined Name/Data node instance and use the “hdfs dfs -put” 
command, I have no trouble storing files.  I’m using the same user regardless 
of whether I’m using a remote fs.write operation or whether I’m using the “hdfs 
dfs -put” command while logged into the NameNode.  

What am I doing wrong?  — Eric



