My question is: what is the best solution for making the master (namenode)
>>> fail over? I read a lot, but I don't know which option is best.
>>> I found this howto:
>>> http://www.cloudera.com/blog/2009/07/hadoop-ha-configuration/, but if it is
>>> possible, I do not want to use DRBD.
>>> I hope somebody can help me. Sorry for my English :)
>>>
>>> Thanks.
>>> Tibi
--
Joey Echeverria
Solutions Architect
Cloudera, Inc.
The current replica placement policy is not aware of multiple levels in your
topology. So, in your example, it would pick any of the other three racks:
/rackA/rack1, /rackB/rack3, or /rackB/rack4 with equal probability.
The only way to get the behavior you desire is to specify only one level of
You can append to a file in later versions of HDFS (0.21+), but there is no
support for modifying a file in place. What's your use case?
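For reference, appending with the FileSystem API looks roughly like the sketch below (the path is hypothetical, and it assumes a 0.21+ build with append support enabled):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class AppendExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Hypothetical path; the file must already exist to be appended to.
    Path file = new Path("/user/alieh/data.log");
    FSDataOutputStream out = fs.append(file);
    out.write("one more record\n".getBytes("UTF-8"));
    out.close();
    fs.close();
  }
}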
-Joey
On Tue, Feb 14, 2012 at 4:20 AM, Alieh Saeedi wrote:
> Hi
> Is there a way to modify a file?
> Thanks:-)
>
--
Joseph Echeverria
Cloudera, Inc.
Hey Stuti,
Hadoop doesn't get configured for LDAP per se; you just need to configure
your nodes to do LDAP authentication via PAM. Here's a guide:
http://www.howtoforge.com/linux_ldap_authentication
Assuming you already have an LDAP server set up, you probably only care
about the "Client configur
You need to restart the namenode for the new setting to take effect.
The namenode will be unavailable for a while during the restart.
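For reference, dfs.name.dir takes a comma-separated list of directories, so the hdfs-site.xml entry would look something like this (paths here are made up):

<property>
  <name>dfs.name.dir</name>
  <!-- The namenode writes its image and edit log to every listed directory. -->
  <value>/data/1/dfs/nn,/mnt/remote/dfs/nn</value>
</property>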
-Joey
On Tue, Jan 31, 2012 at 5:26 PM, Jain, Prem wrote:
> Team,
>
> I would like to add an additional directory under the “dfs.name.dir” property in
> order to have one
Three disks, each mounted separately. What you say is true: it will
handle failures better and generally perform better. You'll need to
configure the dfs.datanode.failed.volumes.tolerated parameter in
hdfs-site.xml to make sure that it handles a single failed volume
gracefully.
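Something like this in hdfs-site.xml (a value of 1 means the datanode keeps running with one dead volume; the default is 0):

<property>
  <name>dfs.datanode.failed.volumes.tolerated</name>
  <!-- Number of volumes allowed to fail before the datanode shuts itself down. -->
  <value>1</value>
</property>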
-Joey
On Mon, Jan 3
Personally I would just use Har :) It sounds like an interesting
project. You might find this document helpful:
http://kazman.shidler.hawaii.edu/ArchDoc.html
It was designed to help contributors navigate the HDFS source tree.
-Joey
On Thu, Jan 19, 2012 at 11:52 AM, Sesha Kumar wrote:
> I'm cu
ve access.
>
> I do not want that users which have access on a directory level can see all
> the inner content even if they do not have access permission on them.
>
> I thought of attaining it using ACL's . Is there any other way through which
> I can achieve this goa
What classpath is fuse_dfs using? It looks like you're missing some jars.
-Joey
On Jan 19, 2012, at 5:40, Stuti Awasthi wrote:
> No, I have not set any security in conf/sites file. My sites file have just
> basic entries to start the hdfs cluster.
>
> -Original Message-
> From: alo a
HDFS only supports Unix-style read, write, and execute permissions. What
style of ACLs do you want to apply?
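In other words, you're limited to the usual owner/group/other bits, which you can set from the shell. The paths and names below are just examples:

hadoop fs -chmod 750 /user/stuti/private        # owner rwx, group r-x, other ---
hadoop fs -chown stuti:analysts /user/stuti/private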
-Joey
On Wed, Jan 18, 2012 at 7:55 AM, Stuti Awasthi wrote:
> Thanks Alex,
> Yes, I wanted to apply ACL's on every file/directory created on HDFS. Is
> there absolutely no way to achieve that
Yes.
On Jan 18, 2012, at 3:39, Stuti Awasthi wrote:
> Ok. Thanks Arun
> So is Hadoop-1.0.0 compatible with the HBase stable release with append
> support?
>
> -Original Message-
> From: Arun C Murthy [mailto:a...@hortonworks.com]
> Sent: Wednesday, January 18, 2012 1:30 PM
> To: hd
Sesha,
What kind of processing are you attempting to do? Maybe it makes more sense
to just implement a MapReduce job rather than modifying the datanodes?
-Joey
On Mon, Jan 16, 2012 at 9:20 AM, Sesha Kumar wrote:
> Hey guys,
>
> Sorry for the typo in my last message. I have corrected it.
>
> I w
Yup, just start reading from wherever the block starts and stop at the
end of the block to do local reads.
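A rough sketch with the FileSystem API follows (the path and the block choice are hypothetical; in practice you'd pick the block whose getHosts() includes the local node):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReadOneBlock {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path file = new Path("/data/big-file.bin");      // hypothetical path
    FileStatus status = fs.getFileStatus(file);
    BlockLocation[] blocks =
        fs.getFileBlockLocations(status, 0, status.getLen());

    BlockLocation block = blocks[0];                 // pick the local block in practice
    FSDataInputStream in = fs.open(file);
    in.seek(block.getOffset());                      // start of the block

    byte[] buf = new byte[64 * 1024];
    long remaining = block.getLength();              // stop at the end of the block
    while (remaining > 0) {
      int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
      if (n < 0) break;
      remaining -= n;
      // process buf[0..n) here
    }
    in.close();
    fs.close();
  }
}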
-Joey
On Wed, Jan 11, 2012 at 11:31 AM, David Pavlis wrote:
> Hi Todd,
>
> If I use the FileSystem API and I am on local node - how do I get/read just
> that particular block residing local
Take a look at Hue; it's a web app that does exactly what you're talking about.
It uses a combination of RPC calls and other public APIs, as well as a custom
plugin that adds additional APIs via Thrift.
Hue can be deployed on any node and is mostly written in Python.
-Joey
On Jan 11, 2012, at
> Frank Grimes
>
>
> On 2012-01-06, at 1:05 PM, Joey Echeverria wrote:
>
>> I would do it by staging the machine data into a temporary directory
>> and then renaming the directory when it's been verified. So, data
>> would be written into directories like this:
I would do it by staging the machine data into a temporary directory
and then renaming the directory when it's been verified. So, data
would be written into directories like this:
2012-01/02/00/stage/machine1.log.avro
2012-01/02/00/stage/machine2.log.avro
2012-01/02/00/stage/machine3.log.avro
Aft
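The promotion step is just a rename of the staged directory, roughly like this (the destination name is hypothetical):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class PromoteStagedDir {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    Path staged = new Path("2012-01/02/00/stage");   // layout from above
    Path verified = new Path("2012-01/02/00/final"); // hypothetical final name
    // HDFS rename is atomic, so readers never see a half-verified directory.
    if (!fs.rename(staged, verified)) {
      throw new RuntimeException("rename failed: " + staged + " -> " + verified);
    }
    fs.close();
  }
}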
Don't start your daemons as root. They should be started under a system
account: typically hdfs for the HDFS services and mapred for the
MapReduce ones.
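For example, something along these lines (just a sketch; packaged installs usually ship init scripts that handle this for you):

sudo -u hdfs   $HADOOP_HOME/bin/hadoop-daemon.sh start namenode
sudo -u hdfs   $HADOOP_HOME/bin/hadoop-daemon.sh start datanode
sudo -u mapred $HADOOP_HOME/bin/hadoop-daemon.sh start jobtracker
sudo -u mapred $HADOOP_HOME/bin/hadoop-daemon.sh start tasktracker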
-Joey
On Fri, Dec 23, 2011 at 4:04 AM, Martinus Martinus
wrote:
> Hi Ayon,
>
> I tried to setup the hadoop-cluster using hadoop-0.20.2 and it seem
HDFS doesn't natively support NFS. In order to export HDFS via NFS you'd have
to mount it to the local file system with fuse and then export that directory.
In that case, all traffic would go through the host acting as the NFS server.
-Joey
On Dec 16, 2011, at 19:07, Mark Hedges wrote:
>
>
How long did you wait after copying? I've seen this behavior before and it's
due to the semantics of close in fuse and not easily fixed in fuse-dfs. In a
minute or so, though, the copy should have the right size.
-Joey
On Dec 16, 2011, at 1:55, Stuti Awasthi wrote:
> Hi All,
>
> I installed a
Those versions should work fine together. Did you get Hadoop
configured for pseudo-distributed mode correctly, or are you having
trouble with both?
-Joey
On Fri, Dec 9, 2011 at 4:57 AM, Mohammad Tariq wrote:
> Is there any specific combination of Hadoop and Hbase in order to use
> Hbase in atlea
You could check out Hoop[1], a REST interface for accessing HDFS.
Since it's REST based, you can easily load balance clients across
multiple servers. You'll have to write the C/C++ code for
communicating with Hoop, but that shouldn't require too much more than
a thin wrapper around an HTTP client l
Hey Stuti,
Fuse is probably the most commonly used solution. It has some
limitations because HDFS isn't POSIX-compliant, but it works for a
lot of use cases. You can try out both the contrib driver and the
google code version. I'm not sure which will work better for your
Hadoop version. Newer H
The balancer only balances between datanodes. This means the new
drives won't get used until you start writing new data to them. If you
want to balance the drives on a node, you need to
1) copy a bunch of block files from the old drives to the new drives
2) shutdown the datanode
3) delete the old
What is the output of the following:
hadoop fs -ls hdfs://10.0.0.61/user/kiranprasad.g/pig-0.8.1/tutorial/data/excite-small.log
-Joey
On Tue, Sep 20, 2011 at 1:44 AM, kiranprasad
wrote:
> Hi
>
> When I have run the same from local mode it is working fine and I got the
> result, but on Hadoop f
On the NN:
rm -rf ${dfs.name.dir}/*
On the DN:
rm -rf ${dfs.data.dir}/*
-Joey
On Fri, Sep 16, 2011 at 7:21 AM, kiranprasad
wrote:
> What do I need to clear from the hadoop directory.
>
> -Original Message- From: Stephan Gammeter
> Sent: Friday, September 16, 2011 3:57 PM
> To: hdfs-us
e
> /user/hdfs/files/d954x328-85x8-4dfe-b73c-34a7a2c1xb0f is closed by
> DFSClient_1277823200
>
> Is there any way I can find out from the log when the safe mode gets over.
>
> Regards,
> Rahul
>
> On Thu, Jul 28, 2011 at 6:16 PM, Joey Echeverria wrote:
>>
>>
3_15838442 reported from xx.xx.xx.xx:50010
> current size is 1950720 reported size is 2448907
>
> I think the edit file size was too huge thats why it took long time.
>
> Regards,
> Rahul
>
> On Fri, Jul 22, 2011 at 9:33 PM, Joey Echeverria wrote:
> The lon
You could do it with streaming and a single reducer:
bin/hadoop jar $HADOOP_HOME/hadoop-0.20.2-streaming.jar \
  -Dmapred.num.reduce.tasks=1 \
  -reducer cat \
  -input /hdfs/directory/allsource* \
  -output mergefile \
  -verbose
-Joey
On Fri, Jul 22, 2011 at 1:26 PM, Time Less wrote:
> Hello, List!
>
> I have s
ption: Call to xx.xx.xx.xx:9000 failed on local exception:
> java.io.IOException: Connection reset by peer
>
> Regards,
> Rahul
>
>
> On Fri, Jul 22, 2011 at 5:40 PM, Joey Echeverria wrote:
>
>> Do you have an instance of the SecondaryNamenode in your cluster?
>>
>>
Do you have an instance of the SecondaryNamenode in your cluster?
-Joey
On Fri, Jul 22, 2011 at 3:15 AM, Rahul Das wrote:
> Hi,
>
> I am running a Hadoop cluster with 20 Data node. Yesterday I found that the
> Namenode was not responding ( No write/read to HDFS is happening). It got
> stuck for
Facebook contributed some code to do something similar called HDFS RAID:
http://wiki.apache.org/hadoop/HDFS-RAID
-Joey
On Jul 18, 2011, at 3:41, Da Zheng wrote:
> Hello,
>
> It seems that data replication in HDFS is simply data copy among nodes. Has
> anyone considered to use a better encodi
HBase does not require MapReduce.
-Joey
On Jul 16, 2011, at 11:46, Rita wrote:
> So, I use hdfs to store very large files and access them thru various client
> (100 clients) using FS utils. Are there any other tools or projects that
> solely use hdfs as its storage for fast access? I know
By any chance, do you have 3 directories set in dfs.data.dir all of which
are on /dev/hda1?
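For context, a setup like the following (paths made up), with all three directories on the same partition, has the datanode treating a single /dev/hda1 filesystem as three separate storage directories, which can make the capacity and usage numbers look off:

<property>
  <name>dfs.data.dir</name>
  <!-- Three directories, but all on the single /dev/hda1 filesystem. -->
  <value>/data/a/dfs/dn,/data/b/dfs/dn,/data/c/dfs/dn</value>
</property>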
-Joey
On Mon, Jun 13, 2011 at 3:01 PM, Time Less wrote:
> I have a datanode with a ~900GB hard drive in it:
>
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/hda1 878G 384G
any negative consequences of running the fsck -move just to
> try it?
>
> On Jun 10, 2011, at 3:33 PM, Joey Echeverria wrote:
>
>> Good question. I didn't pick up on the fact that fsck disagrees with
>> dfsadmin. Have you tried a full restart? Maybe somebody's infor
d tell me if there were issues.
>
> So will running hadoop fsck -move just move the corrupted replicas and leave
> the good ones? Will this work even though fsck does not report any corruption?
>
> On Jun 9, 2011, at 3:20 PM, Joey Echeverria wrote:
>
>> hadoop fsck -move will mo
'll cut your usage by another 1/3. This
becomes very significant very quickly.
-Joey
On Fri, Jun 10, 2011 at 12:36 PM, Anh Nguyen wrote:
> On 06/10/2011 04:57 AM, Joey Echeverria wrote:
>>
>> Hi On,
>>
>> The namenode stores the full filesystem image in memory. Lo
Hi On,
The namenode stores the full filesystem image in memory. Looking at
your stats, you have ~30 million files/directories and ~47 million
blocks. That means that on average, each of your files is only ~1.4
blocks in size. One way to lower the pressure on the namenode would
be to store fewer,
hadoop fsck -move will move the corrupt files to /lost+found, which
will "fix" the report.
Do you know what created the corrupt files?
-Joey
On Thu, Jun 9, 2011 at 3:04 PM, Robert J Berger wrote:
> I'm still having this problem and am kind of paralyzed until I figure out how
> to eliminate the
They write directly to HDFS; there's no additional buffering on the
local file system of the client.
-Joey
On Tue, May 31, 2011 at 7:56 PM, Mapred Learn wrote:
> Hi guys,
> I asked this question earlier but did not get any response. So, posting
> again. Hope somebody can point to the right descr
How much memory do you have on your DataNode? Is it possible that
you're swapping?
-Joey
On Mon, May 30, 2011 at 11:09 PM, ccxixicc wrote:
>
> Hi,all
> I found NameNode often lost heartbeat from DataNodes:
> org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.heartbeatCheck: lost
> heartbeat f
fic conf and/or in mapred-site.xml.
>>
>>
>> Friso
>>
>>
>>
>> On 19 mei 2011, at 03:42, Steve Cohen wrote:
>>
>>> Where is the default replication factor on job files set? Is it different
>>> then the dfs.replication setting i
KFS
On May 18, 2011 7:03 PM, "Thanh Do" wrote:
> hi hdfs users,
>
> Is anybody aware of a system
> that is similar to HDFS, in the sense
> that it has single master architecture,
> and the master also keeps an operation log.
>
> Thanks,
>
> Thanh
Did you run a MapReduce job?
I think the default replication factor on job files is 10, which
obviously doesn't work well on a pseudo-distributed cluster.
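If I remember right, the knob for that is mapred.submit.replication; on a pseudo-distributed cluster you could drop it in mapred-site.xml:

<property>
  <name>mapred.submit.replication</name>
  <!-- Replication for the job jar and other submit-time files; defaults to 10. -->
  <value>1</value>
</property>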
-Joey
On Wed, May 18, 2011 at 5:07 PM, Steve Cohen wrote:
> Thanks for the answer. Earlier, I asked about why I get occasional not
> repli
Which version of Hadoop are you running?
I'm pretty sure the problem is that you're overcommitting your RAM. Hadoop
really doesn't like swapping. I would try setting your
mapred.child.java.opts to -Xmx1024m.
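That would go in mapred-site.xml (or the per-job configuration), e.g.:

<property>
  <name>mapred.child.java.opts</name>
  <!-- Heap for each map/reduce child JVM. -->
  <value>-Xmx1024m</value>
</property>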
-Joey
On Wed, May 11, 2011 at 2:23 AM, Evert Lammerts wrote:
> Hi list,
>
> I notice that w
How much data do you have? It takes some time for all of the datanodes
to report that all blocks are accounted for.
-Joey
On Wed, May 4, 2011 at 4:05 PM, Himanshu Vashishtha
wrote:
> Hey,
> Every thing comes up for good.
> Why this delay of 6 minutes I wonder? And I see that this delay has nothi