So I read through the Javadocs some more. I had 11 reducers on my original job,
which left me with 11 MapFile directories. I am passing their parent directory
in here as "outDir".
MapFile.Reader[] readers = MapFileOutputFormat.getReaders(fileSys, outDir, defaults);
Partitioner part = (Partitioner) ReflectionUtils.newInstance(conf.getPartitionerClass(), conf);
Text entryValue = (Text) MapFileOutputFormat.getEntry(readers, part, new Text("mykey"), null);
System.out.println("My Entry's Value: ");
System.out.println(entryValue.toString());
But I am getting an exception:
Exception in thread "main" java.lang.ArithmeticException: / by zero
    at org.apache.hadoop.mapred.lib.HashPartitioner.getPartition(HashPartitioner.java:35)
    at org.apache.hadoop.mapred.MapFileOutputFormat.getEntry(MapFileOutputFormat.java:85)
    at mypackage.MyClass.main(ProfileReader.java:110)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:155)
I am assuming I am doing something wrong, but I'm not sure what it is yet. Any
ideas?
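
One guess, sketched here under assumptions rather than confirmed against the
Hadoop source: the top two frames of the trace suggest that getEntry asks the
partitioner which reader should hold the key, and that HashPartitioner takes
the key's hash modulo the number of partitions. If getEntry passes
readers.length as that number, an empty readers array (for example, if outDir
does not point at the directory holding the part-* MapFiles) would produce
exactly this "/ by zero". The class below is a hypothetical stand-in that uses
plain TreeMaps in place of MapFile.Reader; PartitionLookupSketch and its
methods are illustration names, not Hadoop APIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class PartitionLookupSketch {
    // Assumed hash-partitioner arithmetic: non-negative hash modulo partition count.
    static int getPartition(Object key, int numPartitions) {
        return (key.hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Stand-in for getEntry: pick ONE "reader" by partition, then look the key up in it.
    static String getEntry(List<TreeMap<String, String>> readers, String key) {
        int part = getPartition(key, readers.size());
        return readers.get(part).get(key);
    }

    public static void main(String[] args) {
        int numReducers = 11;
        List<TreeMap<String, String>> readers = new ArrayList<TreeMap<String, String>>();
        for (int i = 0; i < numReducers; i++) {
            readers.add(new TreeMap<String, String>());
        }
        // Simulate the reduce side: the key lands in the partition its hash selects.
        readers.get(getPartition("mykey", numReducers)).put("mykey", "myvalue");

        System.out.println(getEntry(readers, "mykey")); // prints "myvalue"

        // With zero readers the modulo divides by zero, as in the trace above.
        try {
            getEntry(new ArrayList<TreeMap<String, String>>(), "mykey");
        } catch (ArithmeticException e) {
            System.out.println("empty readers: " + e.getMessage());
        }
    }
}
```

If that guess is right, printing readers.length right after the getReaders call
would be a quick check.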
-Xavier
-----Original Message-----
From: Xavier Stevens
Sent: Mon 3/10/2008 3:49 PM
To: [email protected]
Subject: RE: What's the best way to get to a single key?
I was thinking it would be easier to search a single index, unless I
don't have to worry about that because Hadoop searches all my indexes at
the same time. Is that the case?
-Xavier
-----Original Message-----
From: Doug Cutting [mailto:[EMAIL PROTECTED]
Sent: Monday, March 10, 2008 3:45 PM
To: [email protected]
Subject: Re: What's the best way to get to a single key?
Xavier Stevens wrote:
> Thanks for everything so far. It has been really helpful. I have one
> more question. Is there a way to merge MapFile index/data files?
No.
To append text files you can use 'bin/hadoop fs -getmerge'.
To merge sorted SequenceFiles (like MapFile/index files) you can use:
http://hadoop.apache.org/core/docs/current/api/org/apache/hadoop/io/SequenceFile.Sorter.html#merge(org.apache.hadoop.fs.Path[], org.apache.hadoop.fs.Path, boolean)
But this doesn't generate a MapFile.
Why is a single file preferable?
Doug
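
For readers following along, merging already-sorted files as described above
can be pictured as a k-way merge. The sketch below is a plain-Java
illustration using in-memory lists of keys; it is not the SequenceFile.Sorter
implementation, and SortedMergeSketch, Head, and merge are hypothetical names.
It also shows why the result is not a MapFile: the output is one sorted
stream, and no index is built alongside it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

public class SortedMergeSketch {
    // One pending entry per input run: its current key and the rest of that run.
    private static final class Head implements Comparable<Head> {
        final String key;
        final Iterator<String> rest;

        Head(String key, Iterator<String> rest) {
            this.key = key;
            this.rest = rest;
        }

        @Override
        public int compareTo(Head other) {
            return key.compareTo(other.key);
        }
    }

    // Merge several individually sorted runs into one sorted list.
    static List<String> merge(List<List<String>> sortedRuns) {
        PriorityQueue<Head> heap = new PriorityQueue<Head>();
        for (List<String> run : sortedRuns) {
            Iterator<String> it = run.iterator();
            if (it.hasNext()) {
                heap.add(new Head(it.next(), it));
            }
        }
        List<String> out = new ArrayList<String>();
        while (!heap.isEmpty()) {
            Head smallest = heap.poll();      // smallest key across all runs
            out.add(smallest.key);
            if (smallest.rest.hasNext()) {    // refill from the run it came from
                heap.add(new Head(smallest.rest.next(), smallest.rest));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<List<String>> runs = Arrays.asList(
            Arrays.asList("apple", "mango"),
            Arrays.asList("banana", "zebra"),
            Arrays.asList("cherry"));
        System.out.println(merge(runs)); // prints [apple, banana, cherry, mango, zebra]
    }
}
```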