andy2005...@gmail.com
when I set the IP to localhost, it works well, but if I change localhost to the
IP address, it does not work at all.
So my hadoop is OK; it's just the connection that fails.
Rasit OZDAS wrote:
Your hadoop isn't working at all or isn't working at the specified port.
- Try the stop-all.sh command on the namenode. If it says "no namenode to stop",
then take a look at the namenode logs and paste them here if anything seems strange.
- If the namenode logs are OK (filled with INFO messages), then take a look at all
I have a similar situation; I have very small files.
I never tried HBase (I want to), but you can also group them
and write (let's say) 20-30 into one file, as every file becomes a key in that
big file.
There are methods in the API with which you can write an object as a file into HDFS,
and read it again to get
Take a look at this topic:
http://dsonline.computer.org/portal/site/dsonline/menuitem.244c5fa74f801883f1a516106bbe36ec/index.jsp?pName=dso_level1_aboutpath=dsonline/topics/agentsfile=about.xmlxsl=generic.xsl;
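To give a rough idea, something like this is what I mean (just a sketch; the path is only an example and the object must be Serializable):
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path path = new Path("/user/rasit/objects/record-1.obj");

// write the object into HDFS
String record = "example payload";            // any Serializable object would do
ObjectOutputStream out = new ObjectOutputStream(fs.create(path));
out.writeObject(record);
out.close();

// read it back
ObjectInputStream in = new ObjectInputStream(fs.open(path));
String restored = (String) in.readObject();
in.close();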
2009/4/14 Burak ISIKLI burak.isi...@yahoo.com:
Hello everyone;
I want to write a
It's normal that they are all empty. Look at files with .log extension.
On Sunday, 12 April 2009 at 23:30, halilibrahimcakir
halilibrahimca...@mynet.com wrote:
I followed these steps:
$ bin/stop-all.sh
$ rm -ri /tmp/hadoop-root
$ bin/hadoop namenode -format
$ bin/start-all.sh
and looked
Does your system request a password when you ssh to localhost outside hadoop?
On Sunday, 12 April 2009 at 20:51, halilibrahimcakir
halilibrahimca...@mynet.com wrote:
Hi
I am new to Hadoop. I downloaded Hadoop-0.19.0 and followed the
instructions in the quick start
There are two commands in hadoop quick start, used for passwordless ssh.
Try those.
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
http://hadoop.apache.org/core/docs/current/quickstart.html
--
M. Raşit ÖZDAŞ
@Nick, I use Ajax very often and have previously done projects with ZK
and jQuery; I can easily say that GWT was the easiest of them.
JavaScript is only needed where the core features aren't enough. I can
safely assume that we won't need any inline JavaScript.
@Philip,
Thanks for the point. That is a
that they're sent to the web UI as application parameters when
hadoop initializes.
I'll try to contribute the GUI part of my project to the hadoop source, if you
want, no problem.
But I need static references to the namenode and jobtracker for this.
And I think it will be useful for everyone like me.
M. Rasit
MultipleOutputFormat would be what you want. It supplies multiple files as
output.
I can paste some code here if you want..
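For example, a minimal sketch of what I have in mind (class and directory names are just illustrative; old mapred API):
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.MultipleTextOutputFormat;

public class KeyBasedOutputFormat extends MultipleTextOutputFormat<Text, Text> {
    protected String generateFileNameForKeyValue(Text key, Text value, String name) {
        // "name" is the default part-NNNNN file; prefix it with the key so that
        // records with different keys land in different output files.
        return key.toString() + "/" + name;
    }
}

// and in the job driver:
// conf.setOutputFormat(KeyBasedOutputFormat.class);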
2009/4/2 Vishal Ghawate vishal_ghaw...@persistent.co.in
Hi,
I am new to the map-reduce programming model.
I am writing an MR that will process the log file and results
Hi, hadoop is normally designed to write to disk. There is a special file
format which writes output to RAM instead of disk.
But I have no idea if it's what you're looking for.
If what you said exists, there should be a mechanism which sends output as
objects rather than file content
Since every file name is different, you have a unique key for each map
output.
That means every iterator has only one element, so you won't need to search
for a given name.
But it's possible that I misunderstood you.
2009/4/2 Vishal Ghawate vishal_ghaw...@persistent.co.in
Hi ,
I just wanted
Hi, Sim,
I have two suggestions, if you haven't done so yet:
1. Check if your other hosts can ssh to the master.
2. Take a look at the logs of the other hosts.
2009/4/2 Puri, Aseem aseem.p...@honeywell.com
Hi
I have a small Hadoop cluster with 3 machines. One is my
NameNode/JobTracker +
Yes, we've constructed a local version of a hadoop process.
We needed 500 input files in hadoop to match the speed of the local process;
total time was 82 seconds on a cluster of 6 machines.
And I think that's good performance compared with other distributed processing
systems.
2009/4/2 jason hadoop
I think the problem is that you have no right to access the path you defined.
Did you try it with a path under your user directory?
You can change permissions from the console.
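For example (the path here is just an illustration):
$ bin/hadoop dfs -chmod -R 755 /user/nagaraj/output
or simply use a path under /user/<your_user_name>.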
2009/4/1 Nagaraj K nagar...@yahoo-inc.com
Hi,
I am trying to do a side-effect output along with the usual output from the
Yes, as additional info,
you can use this code to just submit the job without waiting until it's finished:
JobClient client = new JobClient(conf);
RunningJob job = client.submitJob(conf);  // submitJob() returns immediately; runJob() would block until completion
2009/4/1 javateck javateck javat...@gmail.com
you can run from java program:
JobConf conf = new
There is also a good alternative:
we use ObjectInputFormat and ObjectRecordReader.
With them you can easily do File-to-Object translations.
I can send a code sample to your mail if you want.
If performance is important to you, look at this quote from a previous
thread:
HDFS is a file system for distributed storage, typically for a distributed
computing scenario over hadoop. For office purposes you will require a SAN
(Storage Area Network) - an architecture to attach remote computer
I doubt if I understood you correctly, but if so, there is a previous
thread that helps to better understand what hadoop is intended to be, and what
disadvantages it has:
http://www.nabble.com/Using-HDFS-to-serve-www-requests-td22725659.html
2009/4/2 Rasit OZDAS rasitoz...@gmail.com
If performance
It seems that either NameNode or DataNode is not started.
You can take a look at log files, and paste related lines here.
2009/3/29 deepya m_dee...@yahoo.co.in:
Thanks,
I have another doubt. I just want to run the examples and see how it works. I
am trying to copy the file from local file
Two quotes for this problem:
"Streaming map tasks should have a map_input_file environment
variable like the following:
map_input_file=hdfs://HOST/path/to/file"
"the value for map.input.file gives you the exact information you need."
(didn't try)
Rasit
2009/3/26 Jason Fennell jdfenn...@gmail.com:
Just to inform, we installed v.0.21.0-dev and there is no such issue now.
2009/3/6 Rasit OZDAS rasitoz...@gmail.com
So, is there currently no solution to my problem?
Should I live with it? Or do we have to have a JIRA for this?
What do you think?
2009/3/4 Nick Cen cenyo...@gmail.com
Hi,
I'm trying to start the balancer from the API
(org.apache.hadoop.hdfs.server.balancer.Balancer.main()), but I get a
NullPointerException.
09/03/23 15:17:37 ERROR dfs.Balancer: java.lang.NullPointerException
at org.apache.hadoop.dfs.Balancer.run(Balancer.java:1453)
at
Some parameters are global (I can't give an example right now);
they are cluster-wide even if they're defined in hadoop-site.xml.
Rasit
2009/3/9 Nick Cen cenyo...@gmail.com
For Q1: I think so, but I think it is good practice to keep
hadoop-default.xml untouched.
For Q2: I use this property
Hi, all!
I'm using MultipleOutputFormat to write out 4 different files, each one
of the same type.
But it seems that the outputs aren't being sorted.
Should they be sorted? Or isn't sorting implemented for MultipleOutputFormat?
Here is some code:
// in main function
Owen, I tried this, it doesn't work.
I doubt if the static singleton method will work either,
since it's more or less the same.
Rasit
2009/3/2 Owen O'Malley omal...@apache.org
On Mar 2, 2009, at 3:03 AM, Tom White wrote:
I believe the static singleton approach outlined by Scott will work
2009/3/4 Nick Cen cenyo...@gmail.com
Thanks. About the Secondary Sort, can you provide an example? What do
the intermediate keys stand for?
Assume I have
Amit, it's not used here in this example, but it has other uses.
When I needed it, for example, I passed in the name of the input file as the key.
Rasit
2009/3/1 Kumar, Amit H. ahku...@odu.edu
A very basic question:
From the WordCount example below, I don't see why we need the
LongWritable key
Strange, last night I tried 1 input file (map), and the waiting time
after the maps increases (probably linearly)
2009/3/2 Rasit OZDAS rasitoz...@gmail.com
I have 6 reducers, Nick, still no luck..
2009/3/2 Nick Cen cenyo...@gmail.com
how many reducer do you have? You should make
Qiang,
I can't find which one right now, but there is a JIRA issue about
MultipleTextOutputFormat (especially when reducers = 0).
If you have no reducers, you can try having one or two; then you can see if
your problem is related to this one.
Cheers,
Rasit
2009/2/25 ma qiang maqiang1...@gmail.com
Hadoop uses RMI for file copy operations.
Clients listen on port 50010 for this operation.
I assume it sends the file as a byte stream.
Cheers,
Rasit
2009/2/23 Bing TANG whutg...@gmail.com
Hi, everyone,
Could someone tell me the principle of -file when using Hadoop
Streaming? I want to ship
Erik, did you place the ports correctly in the properties window?
Port 9000 under Map/Reduce Master on the left, 9001 under DFS Master on
the right.
2009/2/19 Erik Holstad erikhols...@gmail.com
Thanks guys!
Running Linux and the remote cluster is also Linux.
I have the properties set up like that
Zander,
I've looked at my datanode logs on the slaves, but they are all quite
small, although we've run many jobs on them.
And running 2 new jobs also didn't add anything to them.
(As I understand from the contents of the logs, hadoop especially logs
operations about DFS performance
Philipp, I have no problem running jobs locally with eclipse (via the hadoop
plugin) and observing them from the browser.
(Please note that the jobtracker page doesn't refresh automatically; you need to
refresh it manually.)
Cheers,
Rasit
2009/2/19 Philipp Dobrigkeit pdobrigk...@gmx.de
When I start my job
Erik,
Try adding the following properties to hadoop-site.xml:
<property>
  <name>fs.default.name</name>
  <value>hdfs://ip_address:9000</value>
</property>
<property>
  <name>mapred.job.tracker</name>
  <value>hdfs://ip_address:9001</value>
</property>
Nicholas, like Matei said,
there are 2 possibilities in terms of permissions
(any permissions command works just like in linux):
1. Create a directory for a user and make the user the owner of that directory:
hadoop dfs -chown ... (assuming hadoop doesn't need to have write access to
any file outside the user's
Hi,
There is a JIRA issue about this problem, if I understand it correctly:
https://issues.apache.org/jira/browse/HADOOP-3743
Strangely, I searched all the source code, but this check exists in only 2
places:
if (!(job.getBoolean("mapred.used.genericoptionsparser", false))) {
John, did you try the -D option instead of -jobconf?
I had the -D option in my code; I changed it to -jobconf, and this is what I get:
...
...
Options:
-input    <path>              DFS input file(s) for the Map step
-output   <path>              DFS output directory for the Reduce step
-mapper   <cmd|JavaClassName>
Stefan and Thibaut, are you using MultipleOutputFormat, and how many
reducers do you have?
if you're using MultipleOutputFormat and have no reducer, there is a JIRA
ticket about this issue.
https://issues.apache.org/jira/browse/HADOOP-5268
Or there is a different JIRA issue (it's not resolved
I agree with Amandeep; results will remain forever unless you manually
delete them.
If we are on the right track,
changing the hadoop.tmp.dir property to be outside of /tmp, or changing
dfs.name.dir and dfs.data.dir, should be enough for basic use (I didn't have
to change anything else).
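For example, in hadoop-site.xml (the path is only an illustration):
<property>
  <name>hadoop.tmp.dir</name>
  <value>/home/hadoop/tmp</value>
</property>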
Cheers,
:D
Then I found out that there are 3 similar issues about this problem :D
Quite useful information, isn't it? ;)
2009/2/17 Thibaut_ tbr...@blue.lu
Hello Rasi,
https://issues.apache.org/jira/browse/HADOOP-5268 is my bug report.
Thibaut
Nathan, if you're using BytesWritable, I've heard that it doesn't return
only the valid bytes; it actually returns more than that.
Here is this issue discussed:
http://www.nabble.com/can%27t-read-the-SequenceFile-correctly-td21866960.html
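The usual workaround is something like this (just a sketch; the helper name is made up):
import java.util.Arrays;
import org.apache.hadoop.io.BytesWritable;

public static byte[] validBytes(BytesWritable value) {
    // getBytes() returns the whole backing buffer; only the first getLength() bytes are valid
    return Arrays.copyOf(value.getBytes(), value.getLength());
}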
Cheers,
Rasit
2009/2/18 Nathan Marz nat...@rapleaf.com
,
Jeff
On Tue, Feb 10, 2009 at 5:05 AM, Rasit OZDAS rasitoz...@gmail.com wrote:
Hi,
We have thousands of files, each dedicated to a user. (Each user has
access to other users' files, but they do this not very often.)
Each user runs map-reduce jobs on the cluster.
So we should separate his/her
Sandy, as far as I remember, there were some threads about the same
problem (I don't know if it's solved). Searching the mailing list for
this error, "could only be replicated to 0 nodes, instead of 1", may
help.
Cheers,
Rasit
2009/2/16 Sandy snickerdoodl...@gmail.com:
just some more information:
Yes, I've tried the long solution;
when I execute ./hadoop dfs -put ... from a datanode,
in any case 1 copy gets written to that datanode.
But I think I should use SSH for this.
Does anybody know a better way?
Thanks,
Rasit
2009/2/16 Rasit OZDAS rasitoz...@gmail.com:
Thanks, Jeff.
After
Sandy, I have no idea about your issue :(
Zander,
Your problem is probably related to this JIRA issue:
http://issues.apache.org/jira/browse/HADOOP-1212
Here are 2 workarounds explained:
Kris,
This is the case when you have only 1 reducer.
If it doesn't have any side effects for you..
Rasit
2009/2/14 Kris Jirapinyo kjirapi...@biz360.com:
Is there a way to tell Hadoop to not run Map and Reduce concurrently? I'm
running into a problem where I set the jvm to Xmx768 and it seems
I agree with Amar and James.
If you require permissions for your project,
then:
1. Create a group in linux for your user.
2. Give the group write access to all files in HDFS (hadoop dfs -chmod -R
g+w / - or something like that, I'm not totally sure).
3. Change the group ownership of all files in HDFS (hadoop dfs
With this configuration, any user having that group name will be able
to write to any location..
(I've tried this only in a local network, though)
Hi, Andy
Your problem seems to be a general Java problem rather than a hadoop one.
You may get better help in a Java forum.
String.split uses regular expressions, which you definitely don't need.
I would write my own split function, without regular expressions.
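For a single separator character, something like this would be enough (just a sketch; the names are made up):
import java.util.ArrayList;
import java.util.List;

public static List<String> simpleSplit(String line, char sep) {
    List<String> parts = new ArrayList<String>();
    int start = 0;
    int idx;
    while ((idx = line.indexOf(sep, start)) != -1) {
        parts.add(line.substring(start, idx));
        start = idx + 1;
    }
    parts.add(line.substring(start));   // last (or only) field
    return parts;
}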
This link may help to better understand
Yes, version 18.3 is the most stable one. It has added patches,
without unproven new functionality.
2009/2/11 Owen O'Malley omal...@apache.org:
On Feb 10, 2009, at 7:21 PM, Vadim Zaliva wrote:
Maybe version 0.18
is better suited for production environment?
Yahoo is mostly on 0.18.3 +
I also have the same problem.
It would be wonderful if someone had some info about this..
Rasit
2009/2/10 Mimi Sun m...@rapleaf.com:
I see an UnsatisfiedLinkError. Also I'm calling
System.getProperty("java.library.path") in the reducer and logging it. The
only thing that prints out is
Hi, Mark
Try adding an extra property to that file, and check whether
hadoop recognizes it.
This way you can find out whether hadoop uses your configuration file.
2009/2/10 Jeff Hammerbacher ham...@cloudera.com:
Hey Mark,
In NameNode.java, the DEFAULT_PORT specified for NameNode RPC is 8020.
Hi,
We have thousands of files, each dedicated to a user. (Each user has
access to other users' files, but they do this not very often.)
Each user runs map-reduce jobs on the cluster.
So we should separate his/her files equally across the cluster,
so that every machine can take part in the
Hi, Amandeep,
I've copied the following lines from a site:
--
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
This can have two reasons:
* Your Java application has a memory leak. There are tools like
YourKit Java Profiler that help you to identify such leaks.
*
Hi, Mithila,
"File /user/mithila/test/20417.txt could only be replicated to 0
nodes, instead of 1"
I think your datanode isn't working properly.
Please take a look at the log file of your datanode (logs/*datanode*.log).
If there is no error in that log file, I've heard that hadoop can sometimes mark
Mark,
http://stuartsierra.com/2008/04/24/a-million-little-files/comment-page-1
In this link, there is a tool to create sequence files from tar.gz and
tar.bz2 files.
I don't think this is a real solution, but at least it means more
free memory and a delay of the problems (worst solution).
Rasit
I can add a little method for spotting namenode failures:
I find such problems by running start-all.sh first, then stop-all.sh.
If the namenode starts without error, stop-all.sh gives the output
"stopping namenode..", but in case of an error, it says "no namenode
to stop..".
In case of an error,
Forgot to say, value 0 means that the requested counter does not exist.
2009/2/5 Rasit OZDAS rasitoz...@gmail.com:
Sharath,
I think the static enum definition should be out of Reduce class.
Hadoop probably tries to find it elsewhere with MyCounter, but it's
actually Reduce.MyCounter in your
Sharath,
I think the static enum definition should be outside the Reduce class.
Hadoop probably tries to find it elsewhere as MyCounter, but it's
actually Reduce.MyCounter in your example.
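For example, something like this (class and counter names are made up; old mapred API):
import java.io.IOException;
import java.util.Iterator;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

public class MyJob {
    // defined outside Reduce, so its full name is MyJob.MyCounter everywhere
    public static enum MyCounter { PROCESSED_KEYS }

    public static class Reduce extends MapReduceBase
            implements Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                           OutputCollector<Text, IntWritable> output,
                           Reporter reporter) throws IOException {
            reporter.incrCounter(MyCounter.PROCESSED_KEYS, 1);  // increment the job-wide counter
            while (values.hasNext()) {
                output.collect(key, values.next());
            }
        }
    }
}

// after JobClient.runJob(conf) you can read it back with
// runningJob.getCounters().getCounter(MyJob.MyCounter.PROCESSED_KEYS)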
Hope this helps,
Rasit
2009/2/5 some speed speed.s...@gmail.com:
I Tried the following...It gets compiled
Rajshekar,
It seems that your namenode isn't able to load FsImage file.
Here is a thread about a similar issue:
http://www.nabble.com/Hadoop-0.17.1-%3D%3E-EOFException-reading-FSEdits-file,-what-causes-this---how-to-prevent--td21440922.html
Rasit
2009/2/5 Rajshekar rajasheka...@excelindia.com:
Ian,
here is a list under
"Setting up Hadoop on a single node" > "Basic Configuration" > "Jobtracker
and Namenode settings".
Maybe it's what you're looking for.
Cheers,
Rasit
2009/2/4 Ian Soboroff ian.sobor...@nist.gov:
I would love to see someplace a complete list of the ports that the various
Hadoop
)
at java.util.jar.JarFile.init(JarFile.java:87)
at org.apache.hadoop.util.RunJar.main(RunJar.java:88)
... 4 more
Pls help me out
I tried it myself, it doesn't work.
I've also tried the stream.map.output.field.separator and
map.output.key.field.separator parameters for this purpose; they
don't work either. When hadoop sees an empty string, it takes the default tab
character instead.
Rasit
2009/2/4 jason hadoop
John, I also couldn't find a way from the console.
Maybe you already know it and prefer not to use it, but the API solves this problem:
FileSystem.copyFromLocalFile(boolean delSrc, boolean overwrite, Path
src, Path dst)
If you have to use the console, a longer solution is to create a jar
for this and call it
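By the way, a minimal sketch of the API call above (the paths are just examples):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
// delSrc = false (keep the local copy), overwrite = true (replace the existing HDFS file)
fs.copyFromLocalFile(false, true, new Path("/tmp/local-file.txt"), new Path("/user/john/file.txt"));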
Amandeep,
"SQL command not properly ended" -
I get this error whenever I forget the semicolon at the end.
I know it doesn't make sense, but I recommend giving it a try.
Rasit
2009/2/4 Amandeep Khurana ama...@gmail.com:
The same query is working if I write a simple JDBC client and query the
Hi,
I tried to use SequenceFileInputFormat; for this I appended "SEQ" as the first
bytes of my binary files (with a hex editor),
but I get this exception:
A record version mismatch occured. Expecting v6, found v32
at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1460)
at
...
Thanks for any pointers..
Rasit
in the array (see the write()
method in BytesWritable).
Hope this helps.
Tom
On Mon, Feb 2, 2009 at 3:21 PM, Rasit OZDAS rasitoz...@gmail.com wrote:
I tried to use SequenceFile.Writer to convert my binaries into Sequence
Files,
I read the binary data with FileInputStream, getting all bytes
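For reference, roughly what that conversion looks like (just a sketch; the paths, and using the file name as a Text key with the file bytes as a BytesWritable value, are only my assumptions):
import java.io.File;
import java.io.FileInputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);

File local = new File("/tmp/input.bin");
byte[] bytes = new byte[(int) local.length()];
FileInputStream in = new FileInputStream(local);
in.read(bytes);                                   // read the whole binary file
in.close();

SequenceFile.Writer writer = SequenceFile.createWriter(fs, conf,
        new Path("/user/rasit/input.seq"), Text.class, BytesWritable.class);
writer.append(new Text(local.getName()), new BytesWritable(bytes));
writer.close();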
Thanks for responses, the problem is solved :)
I'll be forwarding the thread to my colleagues.
2009/1/29 nitesh bhatia niteshbhatia...@gmail.com
HDFS is a file system for distributed storage, typically for a distributed
computing scenario over hadoop. For office purposes you will require a SAN
Oh, I can't believe it, my problem was the same; I thought the last one was an
answer to my thread.
Who cares, the problem is solved, thanks!
of the linux server distributions (e.g. RHEL, SuSE) or Solaris (ZFS +
zones), or perhaps the best plug-n-play solution (non-open-source) would be a Mac
Server + XSan.
--nitesh
Thanks,
Rasit
2009/1/28 Rasit OZDAS rasitoz...@gmail.com
Thanks for responses,
Sorry, I made a mistake, it's actually
Do you mean without scanning all the files line by line?
I know little about the implementation of hadoop, but as a programmer I can
presume that it's not possible without a complete scan.
But I can suggest a workaround:
- Compute the number of records manually before putting a file into HDFS.
- Append
Both DFS viewer and job submission work on eclipse v. 3.3.2.
I've given up using Ganymede, unfortunately..
2009/1/26 Aaron Kimball aa...@cloudera.com
The Eclipse plugin (which, btw, is now part of Hadoop core in src/contrib/)
currently is inoperable. The DFS viewer works, but the job
on top of HDFS
and
provides random access.
JG
Hi,
I wanted to ask if HDFS is a good solution just as a distributed DB (no
running jobs, only get and put commands).
A review says that HDFS is not designed for low latency, and besides, it's
implemented in Java.
Do these disadvantages prevent us from using it?
Or could somebody suggest a better
Hi Tien,
Configuration config = new Configuration(true);
config.addResource(new Path("/etc/hadoop-0.19.0/conf/hadoop-site.xml"));
FileSystem fileSys = FileSystem.get(config);
BlockLocation[] locations = fileSys.getFileBlockLocations(.
I copied some lines of my code, it can also help if you
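In case it helps, roughly how that last call continues (as far as I know from the 0.19 API; the path is just an example):
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;

FileStatus status = fileSys.getFileStatus(new Path("/user/tien/data.txt"));
BlockLocation[] locations = fileSys.getFileBlockLocations(status, 0, status.getLen());
for (BlockLocation loc : locations) {
    System.out.println(loc);   // offset, length and the hosts holding this block
}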
Hi,
Try to use:
conf.setJarByClass(EchoOche.class); // conf is the JobConf instance of your
example.
Hope this helps,
Rasit
2009/1/20 Shyam Sarkar shyam.s.sar...@gmail.com
Hi,
I was trying to run Hadoop wordcount version 2 example under Cygwin. I
tried
without pattern.txt file -- It
I would prefer catching the EOFException in my own code,
assuming you are happy with the output before the exception occurs.
Hope this helps,
Rasit
2009/1/16 Konstantin Shvachko s...@yahoo-inc.com
Joe,
It looks like your edits file is corrupted or truncated.
Most probably the last modification
Jim,
As far as I know, there is no operation done after the Reducer.
At first look, the situation reminds me of the same keys being used for all the tasks.
This can be the result of one of the following cases:
- the input format reads the same keys for every task.
- the mapper collects every incoming key-value pair under the same
Hi Alyssa,
http://markmail.org/message/jyo4wssouzlb4olm#query:%22Decommission%20of%20datanodes%22+page:1+mid:p2krkt6ebysrsrpl+state:results
As pointed out there, decommissioning (removal) of datanodes was not an easy job as
of version 0.12.
I strongly suspect it's still not easy.
As far as I know,