Interesting lead, thanks. Meanwhile, I was also thinking of using distcp. With the help of hftp we can overcome the Hadoop version mismatch issue as well, but I think the mismatch in security configuration will still be a problem. I tried it as follows, where the source has Kerberos configured and the destination does not, but it failed with the exception below. This was kicked off from the destination server, of course.
hadoop distcp hftp://<dfs.http.address>/<path> hdfs://<dfs.http.address>/<path>

org.apache.hadoop.ipc.RemoteException(java.io.IOException): Security enabled but user not authenticated by filter
        at org.apache.hadoop.ipc.RemoteException.valueOf(RemoteException.java:97)
        at org.apache.hadoop.hdfs.HftpFileSystem$LsParser.startElement(HftpFileSyste...

(For reference, I have appended two rough sketches at the very bottom of this mail, below the quoted thread: one of the two-Configuration client I keep referring to, and a small check I use to see which authentication mode the client configuration actually resolves to.)

Regards,
Shahab


On Sat, Apr 27, 2013 at 2:51 AM, Damien Hardy <[email protected]> wrote:

> Hello
>
> Maybe you should look at the Export tool's source code, as it can export HBase
> data to a distant HDFS space (by setting a full hdfs:// URL in the command-line
> option for outputdir):
>
> https://github.com/apache/hbase/blob/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Export.java
>
> Cheers,
>
>
> 2013/4/27 Shahab Yunus <[email protected]>
>
> > Thanks Ted for the response. But the issue is that I want to read from one
> > cluster and write to another. If I have two clients, then how will they
> > communicate with each other? Essentially, what I am trying to do here is
> > inter-cluster data copy/exchange. Any other ideas or suggestions? Even if
> > both servers have no security, or one has Kerberos, or both have
> > authentication, how do I exchange data between them?
> >
> > I was actually not expecting that I cannot load multiple Hadoop or HBase
> > configurations in 2 different Configuration objects in one application.
> > As mentioned, I have tried overwriting properties as well, but the
> > security/authentication properties are overwritten somehow.
> >
> > Regards,
> > Shahab
> >
> >
> > On Fri, Apr 26, 2013 at 7:43 PM, Ted Yu <[email protected]> wrote:
> >
> > > Looks like the easiest solution is to use separate clients, one for each
> > > cluster you want to connect to.
> > >
> > > Cheers
> > >
> > > On Sat, Apr 27, 2013 at 6:51 AM, Shahab Yunus <[email protected]> wrote:
> > >
> > > > Hello,
> > > >
> > > > This is a follow-up to my previous post a few days back. I am trying to
> > > > connect to 2 different Hadoop clusters' setups through the same client,
> > > > but I am running into the issue that the config of one overwrites the other.
> > > >
> > > > The scenario is that I want to read data from an HBase table on one
> > > > cluster and write it as a file on HDFS on the other. Individually, if I
> > > > try to write to them, they both work, but when I try this through the same
> > > > Java client, they fail.
> > > >
> > > > I have tried loading the core-site.xml through the addResource method of
> > > > the Configuration class, but only the first config file found is picked up.
> > > > I have also tried renaming the config files and then adding them as a
> > > > resource (again through the addResource method).
> > > >
> > > > The situation is compounded by the fact that one cluster is using Kerberos
> > > > authentication and the other is not. If the Kerberos cluster's file is
> > > > found first, then authentication failures occur for the other server when
> > > > Hadoop tries to find the client authentication information. If the 'simple'
> > > > cluster's config is loaded first, then an 'Authentication is Required'
> > > > error is encountered against the Kerberos server.
> > > >
> > > > I will gladly provide more information. Is it even possible if, let us
> > > > say, both servers have the same security configuration, or none? Any ideas?
> > > > Thanks a million.
> > > >
> > > > Regards,
> > > > Shahab
> > > >
> > >
> >
>
> --
> Damien HARDY
> IT Infrastructure Architect
> Viadeo - 30 rue de la Victoire - 75009 Paris - France
>
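
Below is a rough sketch of the two-Configuration client I keep referring to. All hosts, file paths, the table name, and the Kerberos principal/keytab are made-up placeholders, and I am not at all sure the Kerberos part can behave correctly, since the UserGroupInformation login appears to be static, i.e. global to the whole JVM:

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.security.UserGroupInformation;

public class TwoClusterCopy {
  public static void main(String[] args) throws Exception {
    // Cluster A (secure, Kerberos): load ONLY its files, skip the classpath
    // defaults so the other cluster's core-site.xml cannot leak in.
    Configuration confA = new Configuration(false);
    confA.addResource(new Path("file:///etc/clusterA/core-site.xml"));
    confA.addResource(new Path("file:///etc/clusterA/hdfs-site.xml"));
    confA.addResource(new Path("file:///etc/clusterA/hbase-site.xml"));
    Configuration hbaseConfA = HBaseConfiguration.create(confA);

    // Kerberos login. Note this is static/JVM-wide, which I suspect is the
    // real reason the two security setups step on each other.
    UserGroupInformation.setConfiguration(hbaseConfA);
    UserGroupInformation.loginUserFromKeytab("user@EXAMPLE.REALM",
        "/etc/security/keytabs/user.keytab");

    // Cluster B (simple auth): again, only its own files.
    Configuration confB = new Configuration(false);
    confB.addResource(new Path("file:///etc/clusterB/core-site.xml"));
    confB.addResource(new Path("file:///etc/clusterB/hdfs-site.xml"));

    HTable table = new HTable(hbaseConfA, "my_table");
    FileSystem fsB = FileSystem.get(URI.create("hdfs://clusterB-nn:8020/"), confB);
    FSDataOutputStream out = fsB.create(new Path("/tmp/my_table_dump.txt"));
    try {
      // Scan cluster A's table and write the row keys to a file on cluster B.
      ResultScanner scanner = table.getScanner(new Scan());
      for (Result r : scanner) {
        out.write(r.getRow());
        out.write('\n');
      }
      scanner.close();
    } finally {
      out.close();
      table.close();
      fsB.close();
    }
  }
}

Even with completely separate Configuration objects, that JVM-wide login is what I suspect causes one cluster's security settings to "win" over the other's.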

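And this is the small sanity check I have been using to see which authentication mode a given set of config files actually resolves to on the client side (again, the paths are placeholders); it does not fix anything, it just tells me which core-site.xml won:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class AuthModeCheck {
  public static void main(String[] args) {
    // Load only the files for one cluster, skipping classpath defaults.
    Configuration conf = new Configuration(false);
    conf.addResource(new Path("file:///etc/clusterA/core-site.xml"));

    // Prints "simple" or "kerberos", depending on what the loaded file says.
    System.out.println("hadoop.security.authentication = "
        + conf.get("hadoop.security.authentication", "simple"));

    // Note: this is static/JVM-wide, so it reports whatever configuration
    // UserGroupInformation was last initialized with, not necessarily 'conf'.
    UserGroupInformation.setConfiguration(conf);
    System.out.println("UGI security enabled = "
        + UserGroupInformation.isSecurityEnabled());
  }
}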