Oh, I see. I was assuming a local build of the Hadoop snapshot, installed into the local Maven cache.
On Wednesday, February 11, 2015, Colin P. McCabe <[email protected]> wrote:

> On Wed, Feb 11, 2015 at 11:27 AM, Nick Dimiduk <[email protected]> wrote:
> > I don't recall the hadoop release repo restriction being a problem, but I
> > haven't tested it lately. See if you can just specify the release version
> > with -Dhadoop.version or -Dhadoop-two.version.
>
> Sorry, it's been a while since I did this... I guess the question is
> whether 2.7.0-SNAPSHOT is available in Maven-land somewhere? If so,
> then Chunxu should forget all that stuff I said and just build HBase
> with -Dhadoop.version=2.7.0-SNAPSHOT.
>
> > I would go against branch-1.0, as this will be the imminent 1.0.0
> > release and has HTrace 3.1.0-incubating.
>
> Thanks.
>
> Colin
>
> > -n
> >
> > On Wed, Feb 11, 2015 at 11:13 AM, Colin P. McCabe <[email protected]>
> > wrote:
> >
> >> Thanks for trying stuff out! Sorry that this is a little difficult at
> >> the moment.
> >>
> >> To really do this right, you would want to be using Hadoop with HTrace
> >> 3.1.0 and HBase with HTrace 3.1.0. Unfortunately, there hasn't been a
> >> new release of Hadoop with HTrace 3.1.0; the existing releases of
> >> Hadoop all use an older version of the HTrace library. So you will
> >> have to build from source.
> >>
> >> If you check out Hadoop's "branch-2" branch (currently, this branch
> >> represents what will be in the 2.7 release, when it is cut) and build
> >> that, you will get the latest. Then you have to build a version of
> >> HBase against the version of Hadoop you have built.
> >>
> >> By default, HBase's Maven build will only build against upstream
> >> release versions of Hadoop. So just setting
> >> -Dhadoop.version=2.7.0-SNAPSHOT is not enough, since Maven won't know
> >> where to find the jars. To get around this problem, you can create
> >> your own local Maven repo. Here's how.
> >>
> >> In hadoop/pom.xml, add these lines to the distributionManagement
> >> stanza:
> >>
> >> +    <repository>
> >> +      <id>localdump</id>
> >> +      <url>file:///home/cmccabe/localdump/releases</url>
> >> +    </repository>
> >> +    <snapshotRepository>
> >> +      <id>localdump</id>
> >> +      <url>file:///home/cmccabe/localdump/snapshots</url>
> >> +    </snapshotRepository>
> >>
> >> Comment out the repositories that are already there.
> >>
> >> Now run mkdir /home/cmccabe/localdump.
> >>
> >> Then, in your Hadoop tree, run mvn deploy -DskipTests.
> >>
> >> You should get a localdump directory containing files like this:
> >>
> >> ...
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/maven-metadata.xml.md5
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml.md5
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/hadoop-mapreduce-2.7.0-20121120.230341-1.pom.sha1
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml
> >> ...
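
For anyone following along, the Hadoop-side steps above boil down to
something like this. This is only a sketch: the ~/hadoop checkout
location is an assumption, and the distributionManagement edit
described above must already be in place.

  cd ~/hadoop                        # your Hadoop source checkout (assumed path)
  git checkout branch-2              # what will become the 2.7 release
  mkdir -p /home/cmccabe/localdump   # the local repo referenced in pom.xml
  mvn deploy -DskipTests             # publish 2.7.0-SNAPSHOT artifacts to localdump
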
> >> Now, add the following lines to your HBase pom.xml:
> >>
> >>      <repositories>
> >> +      <repository>
> >> +        <id>localdump</id>
> >> +        <url>file:///home/cmccabe/localdump</url>
> >> +        <name>Local Dump</name>
> >> +        <snapshots>
> >> +          <enabled>true</enabled>
> >> +        </snapshots>
> >> +        <releases>
> >> +          <enabled>true</enabled>
> >> +        </releases>
> >> +      </repository>
> >>        <repository>
> >>
> >> This will allow you to run something like:
> >>
> >>   mvn test -Dtest=TestMiniClusterLoadSequential -PlocalTests \
> >>     -DredirectTestOutputToFile=true -Dhadoop.profile=2.0 \
> >>     -Dhadoop.version=2.7.0-SNAPSHOT -Dcdh.hadoop.version=2.7.0-SNAPSHOT
> >>
> >> Once we do a new release of Hadoop with HTrace 3.1.0, this will get a
> >> lot easier.
> >>
> >> Related: does anyone know what the best git branch of HBase to build
> >> from would be for this kind of testing? I've been meaning to do some
> >> end-to-end testing (it's been on my TODO list for a while).
> >>
> >> best,
> >> Colin
> >>
> >> On Wed, Feb 11, 2015 at 7:55 AM, Chunxu Tang <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I'm now using HTrace to trace request-level data flows in HBase and
> >> > HDFS. I have successfully traced HBase and HDFS with HTrace
> >> > individually.
> >> >
> >> > After that, I combined HBase and HDFS, and I want to send just a
> >> > PUT/GET request to HBase but trace the whole data flow through both
> >> > HBase and HDFS. In my opinion, when I send a request such as a Get
> >> > to HBase, it will eventually read the blocks on HDFS, so I should be
> >> > able to construct a whole data-flow trace through HBase and HDFS.
> >> > However, I can only get tracing data from HBase, with no data from
> >> > HDFS.
> >> >
> >> > Could you give me any suggestions on how to trace the data flow in
> >> > both HBase and HDFS? Does anyone have similar experience? Do I need
> >> > to modify the source code? And which part(s) should I touch? If I
> >> > need to modify the code, I will try to create a patch for that.
> >> >
> >> > Thank you.
> >> >
> >> > My configuration:
> >> > Hadoop version: 2.6.0
> >> > HBase version: 0.99.2
> >> > HTrace version: htrace-master
> >> > OS: Ubuntu 12.04
> >> >
> >> > Joshua
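
To actually see spans flow from HBase into HDFS, the request also has to
be wrapped in a trace span on the client. Here is a minimal sketch,
assuming the HTrace 3.1.0-incubating API (org.apache.htrace) and the
HBase 1.0-style client; the table name "testtable" and row key "row1"
are made-up placeholders:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.Table;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.htrace.Sampler;
  import org.apache.htrace.Trace;
  import org.apache.htrace.TraceScope;

  public class TracedGet {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      try (Connection conn = ConnectionFactory.createConnection(conf);
           Table table = conn.getTable(TableName.valueOf("testtable"))) {
        // Open a top-level span; Sampler.ALWAYS records it unconditionally.
        // Spans created inside the HBase client (and, once both sides carry
        // HTrace 3.1.0, inside HDFS) should attach to it as children.
        TraceScope scope = Trace.startSpan("client-get", Sampler.ALWAYS);
        try {
          Result result = table.get(new Get(Bytes.toBytes("row1")));
          System.out.println("got: " + result);
        } finally {
          scope.close(); // ends the span
        }
      }
    }
  }

A SpanReceiver (e.g. LocalFileSpanReceiver) still has to be configured
on both the HBase and HDFS sides for the spans to be written anywhere;
the exact property names are in the HBase and Hadoop tracing
documentation.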
