Dear All,
Hi, my name is Taeho and I am trying to figure out the maximum number of
files a namenode can hold.
The main reason for doing this is that I want an estimate of how many
files I can put into HDFS without overflowing the namenode machine's
memory.
I know the number
We came across an issue where our jobs failed to report back to the
tracker. (https://issues.apache.org/jira/browse/HADOOP-1790) Now we
are getting a little bit further and the map-phase is working just
fine, but the reduce seems to be stuck at 0%. We are seeing the
following in the logs:
Taeho Kang wrote:
Hello Sameer. Thank you for your useful link. It's been very helpful!
By the way, our Hadoop cluster has a namenode with 4GBytes of RAM.
Based on the analysis found in HADOOP-1687
(http://issues.apache.org/jira/browse/HADOOP-1687), we could probably
state that for
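As a rough back-of-envelope sketch of that estimate (the 150-bytes-per-object figure is my assumption, in the spirit of the heap analysis in HADOOP-1687, not a measured constant):

```python
# Rough namenode capacity estimate.
# ASSUMPTION: each file, directory, and block costs on the order of
# 150 bytes of namenode heap; the real cost varies with path length,
# replication factor, and JVM.
BYTES_PER_OBJECT = 150  # assumed average per namespace object

def max_files(heap_bytes, blocks_per_file=1):
    """Estimate how many files fit in a namenode heap of heap_bytes."""
    objects_per_file = 1 + blocks_per_file  # one inode plus its blocks
    return heap_bytes // (objects_per_file * BYTES_PER_OBJECT)

# 4 GB of heap, one block per file:
print(max_files(4 * 1024**3))  # -> 14316557, i.e. roughly 14 million files
```

So under these assumptions a 4 GB namenode heap tops out somewhere in the low tens of millions of files, and less in practice, since the JVM needs headroom for everything else it tracks.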
I think the following will do what you want.
t1 = load table1 as id, listOfId;
t2 = load table2 as id, f1;
t1a = foreach t1 generate flatten(listOfId); -- flattens the listOfId
into a set of ids
b = join t1a by $0, t2 by id; -- join the two together.
c = foreach b generate t2.id, t2.f1; --
Will it?
Trying an example:
t1 = {1, 2, 3, 4}
t2 = {2, alpha,3,beta,4,gamma}
desired outcome c = {1, alpha, beta, gamma} /* or alternatively */
c = {1, 2,alpha,3,beta,4,gamma}
but as proposed (I hope I am reading the pig document correctly):
t1a = {2,3,4}
b = {2, 2, alpha}
//
Sorry, I misunderstood what you were trying to generate. Perhaps the
following will come closer:
t1 = load table1 as id, listOfId; -- 1, 2,3,4
t2 = load table2 as id, f1; -- 2,a,3,b,4,c
a = foreach t1 generate id, flatten(listOfId); -- 1,2,1,3,1,4
b = join a by $0, t2 by id; --
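One way to sanity-check the revised script is to simulate the flatten and join in plain Python (a model of the semantics, not Pig itself). Note that in a the flattened id lands in the second column, so the simulation joins on that column ($1), on the assumption that it is the intended key:

```python
# Same sample data as the example above.
t1 = [(1, [2, 3, 4])]                            # (id, listOfId)
t2 = [(2, "alpha"), (3, "beta"), (4, "gamma")]   # (id, f1)

# a = foreach t1 generate id, flatten(listOfId);
a = [(id_, x) for (id_, list_of_id) in t1 for x in list_of_id]
print(a)  # [(1, 2), (1, 3), (1, 4)]

# b = join a by $1, t2 by id;  (joining on the flattened id)
b = [(ida, x, idb, f1) for (ida, x) in a for (idb, f1) in t2 if x == idb]
print(b)  # [(1, 2, 2, 'alpha'), (1, 3, 3, 'beta'), (1, 4, 4, 'gamma')]
```

Projecting t2.f1 out of b then gives alpha, beta, gamma for the input id 1, which matches the first desired outcome apart from the leading 1.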
Does anyone have any ideas on this issue?
Otherwise, if I were to write a patch to add this option for jobs to Hadoop,
would it be useful for anyone else?
Thanks
Stu
-----Original Message-----
From: Stu Hood [EMAIL PROTECTED]
Sent: Fri, August 24, 2007 9:43 am
To:
I think this is related to HADOOP-1558:
https://issues.apache.org/jira/browse/HADOOP-1558
Per-job cleanups that are not run clientside must be run in a separate
JVM, since we, as a rule, don't run user code in long-lived daemons.
Doug
I would find it useful to have some sort of listener mechanism, where
you could register an object to be notified of a job completion event
and then respond to it accordingly.
Matt
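Pending such an API, a client can approximate a listener by polling the job status from a background thread. A minimal generic sketch; is_complete here is a stand-in for whatever job-status check the client already has (e.g. a RunningJob handle), not a Hadoop API:

```python
import threading
import time

def watch_job(is_complete, listeners, poll_seconds=1.0):
    """Poll is_complete() and invoke each listener once it returns True.
    is_complete: zero-arg callable standing in for a real job-status check.
    listeners: zero-arg callables to fire on completion."""
    def loop():
        while not is_complete():
            time.sleep(poll_seconds)
        for listener in listeners:
            listener()
    watcher = threading.Thread(target=loop, daemon=True)
    watcher.start()
    return watcher

# Usage: a job that is already complete fires the listener immediately.
watcher = watch_job(lambda: True, [lambda: print("job finished")], 0.1)
watcher.join()  # prints: job finished
```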
Matt Kent wrote:
There is a job completion notification feature.
<property>
  <name>job.end.notification.url</name>
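If I remember correctly, the URL configured there may contain $jobId and $jobStatus placeholders that Hadoop substitutes before issuing an HTTP GET when the job ends (worth verifying against your version). A toy receiver for such notifications, with the URL shape assumed, might look like:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

def parse_notification(path):
    """Pull jobid/status out of a notification request path, assuming a
    URL like http://host:8080/notify?jobid=$jobId&status=$jobStatus."""
    params = parse_qs(urlparse(path).query)
    return params.get("jobid", [None])[0], params.get("status", [None])[0]

class JobEndHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        job_id, status = parse_notification(self.path)
        print("job", job_id, "finished with status", status)
        self.send_response(200)
        self.end_headers()

# To listen: HTTPServer(("", 8080), JobEndHandler).serve_forever()
```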
I don't think the secondary namenodes are working throughout - so not
sure they are a factor.
What I observed:
- stopped dfs. took backup copy of current/ directory
- restarted dfs with new 0.13.1
- after file system is back up - fsck says fs is corrupt. Large number
of files have blocks
Hi Michael,
thanks for the detailed answer, it has been helpful (especially the
log4j DEBUG level for all that classes).
Check the logs to see if you can get a clue as to what is going on. Did
the cluster HMaster get the shutdown signal? (Is it running the
shutdown sequence?) Logs are in
Maybe I am misunderstanding something. Following the intro to Pig Latin
doc (p. 6), the flatten generating 'a' would produce 1,2,3,4 (and not
1,2,1,3,1,4).
-----Original Message-----
From: Alan Gates [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 28, 2007 12:47 PM
To: hadoop-user@lucene.apache.org
Cc:
Hi,
There are two relevant data types in Pig:
i) Tuple: a collection of fields, like a database record.
ii) Bag: a collection of tuples, like a database table.
In,
t1 = load table1 as id, listOfId;
If listOfId is a bag, flattening will give you
1, 2
1, 3
1, 4
If listOfId is a tuple, flattening
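The two flatten behaviors described above can be modeled in plain Python (a simulation of the semantics, not Pig):

```python
def flatten_bag(row_id, bag):
    """Bag case: one output row per inner tuple, cross-producted with
    the outer field.  (1, {(2),(3),(4)}) -> (1,2), (1,3), (1,4)"""
    return [(row_id,) + inner for inner in bag]

def flatten_tuple(row_id, tup):
    """Tuple case: the tuple's fields are spliced into the single row.
    (1, (2,3,4)) -> (1, 2, 3, 4)"""
    return [(row_id,) + tup]

print(flatten_bag(1, [(2,), (3,), (4,)]))  # [(1, 2), (1, 3), (1, 4)]
print(flatten_tuple(1, (2, 3, 4)))         # [(1, 2, 3, 4)]
```

This is why the earlier script's a step yields 1,2 / 1,3 / 1,4 when listOfId is a bag, but would yield the single row 1,2,3,4 if it were a tuple.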