Oh, I see. I was assuming a local build of the Hadoop snapshot, installed into the local Maven cache.
On Wednesday, February 11, 2015, Colin P. McCabe <[email protected]> wrote:

> On Wed, Feb 11, 2015 at 11:27 AM, Nick Dimiduk <[email protected]> wrote:
> > I don't recall the hadoop release repo restriction being a problem, but I
> > haven't tested it lately. See if you can just specify the release version
> > with -Dhadoop.version or -Dhadoop-two.version.
>
> Sorry, it's been a while since I did this... I guess the question is
> whether 2.7.0-SNAPSHOT is available in Maven-land somewhere? If so,
> then Chunxu should forget all that stuff I said and just build HBase
> with -Dhadoop.version=2.7.0-SNAPSHOT.
>
> > I would go against branch-1.0, as this will be the imminent 1.0.0
> > release and has HTrace 3.1.0-incubating.
>
> Thanks.
>
> Colin
>
> > -n
> >
> > On Wed, Feb 11, 2015 at 11:13 AM, Colin P. McCabe <[email protected]>
> > wrote:
> >
> >> Thanks for trying stuff out! Sorry that this is a little difficult at
> >> the moment.
> >>
> >> To really do this right, you would want to be using Hadoop with HTrace
> >> 3.1.0 and HBase with HTrace 3.1.0. Unfortunately, there hasn't been a
> >> new release of Hadoop with HTrace 3.1.0; the existing releases of
> >> Hadoop all use an older version of the HTrace library. So you will
> >> have to build from source.
> >>
> >> If you check out Hadoop's "branch-2" branch (currently, this branch
> >> represents what will be in the 2.7 release, when it is cut) and build
> >> that, you will get the latest. Then you have to build a version of
> >> HBase against the version of Hadoop you have built.
> >>
> >> By default, HBase's Maven build will only build against upstream
> >> release versions of Hadoop. So just setting
> >> -Dhadoop.version=2.7.0-SNAPSHOT is not enough, since Maven won't know
> >> where to find the jars. To get around this problem, you can create
> >> your own local Maven repo. Here's how.
> >>
> >> In hadoop/pom.xml, add these lines to the distributionManagement
> >> stanza:
> >>
> >> +    <repository>
> >> +      <id>localdump</id>
> >> +      <url>file:///home/cmccabe/localdump/releases</url>
> >> +    </repository>
> >> +    <snapshotRepository>
> >> +      <id>localdump</id>
> >> +      <url>file:///home/cmccabe/localdump/snapshots</url>
> >> +    </snapshotRepository>
> >>
> >> Comment out the repositories that are already there.
> >>
> >> Now run mkdir /home/cmccabe/localdump.
> >>
> >> Then, in your Hadoop tree, run mvn deploy -DskipTests.
> >>
> >> You should get a localdump directory containing files like this:
> >>
> >> ...
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/maven-metadata.xml.md5
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml.md5
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/hadoop-mapreduce-2.7.0-20121120.230341-1.pom.sha1
> >> /home/cmccabe/localdump/snapshots/org/apache/hadoop/hadoop-mapreduce/2.7.0-SNAPSHOT/maven-metadata.xml
> >> ...
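
For anyone following along, the Hadoop-side steps above boil down to
something like this. This is only a sketch: the ~/hadoop checkout
location is an assumption, and the distributionManagement edit
described above must already be in place.

  cd ~/hadoop                        # your Hadoop source checkout (assumed path)
  git checkout branch-2              # what will become the 2.7 release
  mkdir -p /home/cmccabe/localdump   # the local repo referenced in pom.xml
  mvn deploy -DskipTests             # publish 2.7.0-SNAPSHOT artifacts to localdump
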
> >> Now, add the following lines to your HBase pom.xml:
> >>
> >>      <repositories>
> >> +      <repository>
> >> +        <id>localdump</id>
> >> +        <url>file:///home/cmccabe/localdump</url>
> >> +        <name>Local Dump</name>
> >> +        <snapshots>
> >> +          <enabled>true</enabled>
> >> +        </snapshots>
> >> +        <releases>
> >> +          <enabled>true</enabled>
> >> +        </releases>
> >> +      </repository>
> >>        <repository>
> >>
> >> This will allow you to run something like:
> >>
> >>   mvn test -Dtest=TestMiniClusterLoadSequential -PlocalTests \
> >>     -DredirectTestOutputToFile=true -Dhadoop.profile=2.0 \
> >>     -Dhadoop.version=2.7.0-SNAPSHOT -Dcdh.hadoop.version=2.7.0-SNAPSHOT
> >>
> >> Once we do a new release of Hadoop with HTrace 3.1.0, this will get a
> >> lot easier.
> >>
> >> Related: does anyone know what the best git branch of HBase to build
> >> from would be for this kind of testing? I've been meaning to do some
> >> end-to-end testing (it's been on my TODO list for a while).
> >>
> >> best,
> >> Colin
> >>
> >> On Wed, Feb 11, 2015 at 7:55 AM, Chunxu Tang <[email protected]> wrote:
> >> > Hi all,
> >> >
> >> > I'm now using HTrace to trace request-level data flows in HBase and
> >> > HDFS. I have successfully traced HBase and HDFS with HTrace
> >> > individually.
> >> >
> >> > After that, I combined HBase and HDFS, and I want to send just a
> >> > PUT/GET request to HBase but trace the whole data flow through both
> >> > HBase and HDFS. In my opinion, when I send a request such as a Get
> >> > to HBase, it will eventually read the blocks on HDFS, so I should be
> >> > able to construct a whole data-flow trace through HBase and HDFS.
> >> > However, I can only get tracing data from HBase, with no data from
> >> > HDFS.
> >> >
> >> > Could you give me any suggestions on how to trace the data flow in
> >> > both HBase and HDFS? Does anyone have similar experience? Do I need
> >> > to modify the source code? And which part(s) should I touch? If I
> >> > need to modify the code, I will try to create a patch for that.
> >> >
> >> > Thank you.
> >> >
> >> > My configuration:
> >> > Hadoop version: 2.6.0
> >> > HBase version: 0.99.2
> >> > HTrace version: htrace-master
> >> > OS: Ubuntu 12.04
> >> >
> >> > Joshua
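
To actually see spans flow from HBase into HDFS, the request also has to
be wrapped in a trace span on the client. Here is a minimal sketch,
assuming the HTrace 3.1.0-incubating API (org.apache.htrace) and the
HBase 1.0-style client; the table name "testtable" and row key "row1"
are made-up placeholders:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.TableName;
  import org.apache.hadoop.hbase.client.Connection;
  import org.apache.hadoop.hbase.client.ConnectionFactory;
  import org.apache.hadoop.hbase.client.Get;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.client.Table;
  import org.apache.hadoop.hbase.util.Bytes;
  import org.apache.htrace.Sampler;
  import org.apache.htrace.Trace;
  import org.apache.htrace.TraceScope;

  public class TracedGet {
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      try (Connection conn = ConnectionFactory.createConnection(conf);
           Table table = conn.getTable(TableName.valueOf("testtable"))) {
        // Open a top-level span; Sampler.ALWAYS records it unconditionally.
        // Spans created inside the HBase client (and, once both sides carry
        // HTrace 3.1.0, inside HDFS) should attach to it as children.
        TraceScope scope = Trace.startSpan("client-get", Sampler.ALWAYS);
        try {
          Result result = table.get(new Get(Bytes.toBytes("row1")));
          System.out.println("got: " + result);
        } finally {
          scope.close(); // ends the span
        }
      }
    }
  }

A SpanReceiver (e.g. LocalFileSpanReceiver) still has to be configured
on both the HBase and HDFS sides for the spans to be written anywhere;
the exact property names are in the HBase and Hadoop tracing
documentation.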
