Thanks everyone,
Let me describe my question in more detail. There are two input
directories, /user/test1/ and /user/test2/, and I want to join their
contents. To do the join, I need to identify which directory each
record comes from, so I use the code below in the mapper:
private int tag = -1;

@Override
public void configure(JobConf conf) {
    try {
        this.conf = conf;
        // Set in the driver, for example:
        // conf.set("paths.to.alias", "0=/user/test1/,1=/user/test2/");
        String pathsToAliasStr = conf.get("paths.to.alias");
        String[] pathsToAlias = pathsToAliasStr.split(",");
        // Strip the scheme and authority so only the path part is compared.
        String path = new Path(conf.get("map.input.file")).toUri().getPath();
        for (int i = 0; i < pathsToAlias.length; i++) {
            String[] pathToAlias = pathsToAlias[i].split("=");
            if (path.startsWith(pathToAlias[1])) {
                // Identify which directory this map instance is handling.
                tag = Integer.parseInt(pathToAlias[0].trim());
            }
        }
    } catch (Throwable e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}
So when the map method runs, all the content handled by that mapper is
tagged as coming from the same directory.
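
The lookup itself does not depend on Hadoop, so it can be checked outside
a job. Below is a minimal sketch of the same prefix match as a standalone
helper. The class name PathTagResolver, the method name resolveTag, and the
host "namenode:8020" in the usage note are hypothetical and only for
illustration. It assumes the input file string is a valid URI (for example,
no unescaped spaces).

```java
import java.net.URI;

// Hypothetical helper (not part of Hadoop): maps an input file to a directory tag.
final class PathTagResolver {
    private PathTagResolver() {}

    /**
     * pathsToAlias: same format as "paths.to.alias", e.g. "0=/user/test1/,1=/user/test2/".
     * inputFile: a value like the one found in "map.input.file".
     * Returns the tag of the first matching directory, or -1 if none matches.
     */
    static int resolveTag(String pathsToAlias, String inputFile) {
        // Keep only the path part, dropping any scheme and authority.
        String path = URI.create(inputFile).getPath();
        for (String entry : pathsToAlias.split(",")) {
            String[] tagAndDir = entry.split("=", 2);
            if (tagAndDir.length == 2 && path.startsWith(tagAndDir[1].trim())) {
                return Integer.parseInt(tagAndDir[0].trim());
            }
        }
        return -1;
    }
}
```

For example, resolveTag("0=/user/test1/,1=/user/test2/",
"hdfs://namenode:8020/user/test2/part-00000") should give the tag for
/user/test2/, and a file under an unlisted directory gives -1.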
I want to know whether one mapper instance only handles content from one
directory at a time.
Thanks
LiuLei
2011/1/21 Eric Sammer <[email protected]>
> LiuLei:
>
> Yes. What you're looking for is TextInputFormat.addPath() (assuming you're
> talking about text). You can call this multiple times and add multiple
> input paths if they are all of the same data format (i.e. text). If you have
> multiple paths that contain different format data, you'll need to use
> MultipleInputs. See the javadoc for details on usage.
>
> On Thu, Jan 20, 2011 at 1:52 AM, lei liu <[email protected]> wrote:
>
> > There are two input paths, for example /user/test1/ and /user/test2/.
> > Can one map instance handle data from several input paths at the same
> > time?
> >
> >
> > Thanks,
> >
> > LiuLei
> >
>
>
>
> --
> Eric Sammer
> twitter: esammer
> data: www.cloudera.com
>