Hi, Chris,
Thanks for adding the map-side join feature
(http://issues.apache.org/jira/browse/HADOOP-2085).
I tried the join example with KeyValueTextInputFormat as the input format, but got
the following exception:
java.lang.NullPointerException
at org.apache.hadoop.mapred.KeyValu
Chris,
I have been meaning to write to you.
Have you seen my grool system which allows simple MR programs to be written
simply?
I have been thinking for some time that it would make a good match with
Cascades.
In addition, I have been working on a layer over Zookeeper to handle
collection of d
Hey all
Just wanted to let interested parties know we just released 0.1.0 of
our Groovy 'builder' extension.
We think this will be a great tool for those groups that need to
expose Hadoop to the 'casual' user who needs to get and manipulate
valuable data on a Hadoop cluster, but doesn't h
I got it committed. Thank you, Spiros!
Hairong
On 5/4/08 10:52 AM, "Spiros Papadimitriou" <[EMAIL PROTECTED]> wrote:
> Hi,
>
> Those two scripts assume that the bin directory is in the path, unlike all
> other scripts.
>
> I opened a JIRA (https://issues.apache.org/jira/browse/HADOOP-2930) an
Keep in mind that many applications can do without real append if they don't
have massive reliability requirements. Just accumulate data on the side and
burp it into HDFS periodically. Then on some longer time scale accumulate
your data burps into a full sized data belch. The cost is surprising
Yeah...
Submit code.
Failing that, help design the code.
On 5/5/08 1:03 PM, "vikas" <[EMAIL PROTECTED]> wrote:
> Is there anything I can do to raise its priority?
Thank you very much for the right link. It really helped. Like many others,
I'm also waiting for
"Append to files in HDFS"
Is there anything I can do to raise its priority? Does the Hadoop
developer community track a request counter for a particular
feature to raise its priority? if that
Maneesha Jain wrote:
I'm looking for any documentation or javadoc for MiniDFSCluster and have not
been able to find it anywhere.
Can someone please point me to it?
http://svn.apache.org/repos/asf/hadoop/core/trunk/src/test/org/apache/hadoop/dfs/MiniDFSCluster.java
This is part of the test cod
Hi,
I'm looking for any documentation or javadoc for MiniDFSCluster and have not
been able to find it anywhere.
Can someone please point me to it?
regards
maneesha
Awesome. Thanks for the replies. Do you mind sharing your code or providing
high-level details on the implementation?
- Original Message
From: Jason Venner <[EMAIL PROTECTED]>
To: core-user@hadoop.apache.org
Sent: Monday, May 5, 2008 12:41:26 PM
Subject: Re: Query against different dat
We do this all the time.
In one case we have the mapper work out the input type by examining the
input file name and the record data. We tend to do this for the textual
keyTABvalue records.
In another case we have a container object that can hold any writable,
that we pass around. We do this f
On May 4, 2008, at 6:27 PM, vikas wrote:
Hi All,
I was looking for how multiple inputs can be written to the same output,
and at different intervals of time (i.e., I want to re-open the same file
to append data to it).
This link did not contain anything related to my question.
http://issues.a
You just have to write an adapted input format that reads multiple kinds of
input.
It can key off the contents of the file or the name. Depending on names is
bad, but has a long lineage so people tend to deal with it reasonably well.
It isn't very hard to write.
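As a rough illustration of keying off the contents or the name, here is a minimal sketch in plain Python rather than the Hadoop API. The function names, the kind labels, and the extension list are all hypothetical. The only Hadoop-specific detail is the assumption that SequenceFiles start with the bytes "SEQ". Contents are checked first and the file name is only a fallback.

```python
import os

# Assumed magic prefix: Hadoop SequenceFiles begin with the bytes "SEQ".
SEQUENCE_FILE_MAGIC = b"SEQ"


def detect_kind(file_name, head):
    """Guess the kind of input from its first bytes, falling back to the name.

    `head` is the first few bytes of the file. The labels returned here
    ("sequence", "xml", "text", "binary") are hypothetical names for this sketch.
    """
    # Key off the contents first: this is more robust than trusting names.
    if head.startswith(SEQUENCE_FILE_MAGIC):
        return "sequence"
    if head.lstrip().startswith(b"<"):
        return "xml"
    # Fall back to the file extension when the contents are inconclusive.
    ext = os.path.splitext(file_name)[1].lower()
    if ext in (".txt", ".tsv"):
        return "text"
    if ext == ".xml":
        return "xml"
    return "binary"


def parse_record(kind, raw):
    """Turn one raw record into a (key, value) pair according to its kind."""
    if kind == "text":
        # Textual key<TAB>value records: split on the first tab only.
        key, _, value = raw.decode("utf-8").partition("\t")
        return key, value
    # Other kinds would need their own readers; pass the bytes through here.
    return kind, raw
```

A real input format would do the same dispatch once per split and then hand each record to the matching reader. Note that a text file that happens to begin with "SEQ" would be misclassified by this sketch.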
-Original Message-
Fr
Has anyone come across this scenario and if not, does anyone have any
suggestions?
Suppose you store different types of data within HDFS: XML, text,
binary, sequence files, etc. You now want to run a query against ALL of the
data stored within HDFS via a map/reduce job. How do you