See inline

On 10/28/2013 06:31 PM, Roman Shaposhnik wrote:
Hi Bruno!

Thanks a million for the detailed and thorough report. A few comments.

On Mon, Oct 28, 2013 at 2:47 AM, Bruno Mahé <bm...@apache.org> wrote:
I am not voting yet since I still have some time, but so far I am leaning
toward a -1.

I am leaning toward a -1 because of
https://issues.apache.org/jira/browse/BIGTOP-1129

I would agree with that. The only trouble is I can't repro it :-(
Could you please provide additional details on the JIRA?

But if it is easy to repro (and I'm just missing something)
I'd agree with you.


I updated the ticket with some more information.
If you still cannot reproduce it, feel free to ping me and I will give you access to the instance. At some point I was wondering if it was because I am mixing the Amazon AMI with CentOS 6 repos, but given that every other service I have tried works...


Things I still want to test (or rather, things I hope I can test by Tuesday
evening):
* Apache Pig and datafu
* Apache Solr

I can help with that. ;-) At least with the default use case. Let me know
if you're still interested.


Sure. Any recommendations?


Things we could do better:
* We could provide some templates for Apache Hadoop. I wasted a few hours
just to get the pi job running. Thankfully we have the init script for HDFS
(which needs some tweaks for the staging directory) and templates for the
configuration files in our puppet modules.
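For reference, the kind of smoke test I mean looks roughly like this. This is a sketch, not the exact steps: the init-hdfs.sh path follows the usual Bigtop layout, and the staging-directory commands are my guess at the tweak mentioned above.

```shell
# Sketch: smoke-testing a fresh Bigtop Hadoop install with the pi example.
# Paths and jar locations are assumptions based on the usual Bigtop layout.

# Populate the HDFS directory structure via the init script shipped in the packages
sudo /usr/lib/hadoop/libexec/init-hdfs.sh

# Create a home directory for the submitting user (the staging-directory tweak)
sudo -u hdfs hadoop fs -mkdir -p /user/$USER
sudo -u hdfs hadoop fs -chown $USER /user/$USER

# Run the pi example job
hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 1000
```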

Good point.

* I enabled short-circuit reads in Apache HBase. Not sure if I missed something,
but I got some "org.apache.hadoop.security.AccessControlException: Can't
continue with getBlockLocalPathInfo() authorization" exceptions. From
reading
http://www.spaggiari.org/index.php/hbase/how-to-activate-hbase-shortcircuit
it seems there are a few things we could do to make it work out of the box
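For the record, the out-of-the-box fix would amount to config along these lines. Property names are from the legacy short-circuit read mechanism described in the blog post above and are worth double-checking against the Hadoop version we ship:

```xml
<!-- hdfs-site.xml on the DataNodes: let the hbase user read blocks directly -->
<property>
  <name>dfs.block.local-path-access.user</name>
  <value>hbase</value>
</property>
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```

plus the matching client-side setting:

```xml
<!-- hbase-site.xml on the RegionServers -->
<property>
  <name>dfs.client.read.shortcircuit</name>
  <value>true</value>
</property>
```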

Indeed! Would you mind filing a JIRA for that? We can totally take care of
this in 0.8.0.


Thanks!
Will do!


* Not sure what I did wrong but although I could access Hue UI, most apps I
tried were not working. Ex: all shells give me the error "value 222 for UID
is less than the minimum UID allowed (500)". And the file browser gives me
the error "Cannot access: /. Note: You are a Hue admin but not a HDFS
superuser (which is "hdfs").". Note that the first user I created was a user
named "ec2-user". Although it is not an hdfs super user, I would expect to
have a working equivalent of what I can browse with the "hdfs -ls" command.
Also creating a hue user named "hdfs" yields the same result. Note that I
did not have time to dig further.

At this point Puppet code is the most reliable way to deploy Hue. Part of
it has to do with the fact that Hue now depends on way more external services
than it used to -- which means that you have to get the configs 'just right'.
Our puppet works great for that -- but that's for a fully distributed cluster.


I was not using puppet (on purpose).
I will give it another shot by looking at what our puppet recipes are doing and see if any of these changes can be baked directly into our packages.

Were you testing on pseudo distributed?


No, fully distributed.

* Phoenix directly embeds Apache Hadoop, Apache HBase and Apache Zookeeper
jars. These jars should be symlinks.
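For completeness, replacing the bundled copies with symlinks in the packaging would amount to something like the following. The jar names, versions, and directories here are illustrative, not the actual ones shipped:

```shell
# Illustrative sketch: swap bundled jars for symlinks to the platform jars.
# Actual jar names and install paths in the release will differ.
cd /usr/lib/phoenix/lib
rm -f hadoop-common-*.jar zookeeper-*.jar
ln -s /usr/lib/hadoop/hadoop-common.jar .
ln -s /usr/lib/zookeeper/zookeeper.jar .
```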

Looks like Andrew took care of that.

* Phoenix required me to delete some old Apache Lucene jars from the Apache
Flume installation directory. From the output of the command "mvn
dependency:tree" on the Flume project, it appears these jars are only needed
for the ElasticSearch and MorphlineSolrSink plugins. But the Flume
documentation for both of these plugins explicitly asks users to provide the
Apache Lucene and Apache Solr/ElasticSearch jars themselves (since they may
use a different version of Apache Lucene). So the dependency on Apache Lucene
in Apache Flume should probably be marked as "provided", and we should
probably provide some packages to manage these dependencies.
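Concretely, marking the dependency as "provided" in the Flume build would look something like this. This is a sketch; the exact module, artifactId, and version property belong to the Flume pom, not to us:

```xml
<!-- e.g. in the ElasticSearch sink module's pom.xml:
     "provided" keeps Lucene off the runtime classpath,
     so users supply the version matching their ElasticSearch -->
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>${lucene.version}</version>
  <scope>provided</scope>
</dependency>
```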

I'd love to dig deeper into this -- could you please file a JIRA with as
much info as possible?


Actually, it was the ElasticSearch sink for Flume that had me update the Apache Lucene jars.

I will probably open a ticket against Apache Flume directly sometime this week to ask for some clarification.


* I still need to figure out why my instance of Hue needs access to
google-analytics.com

WAT? Really? I'm pretty curious at this point.


I had the same reaction.
But this was reported as blocked by NoScript.
Also, it seems to be a setting enabled by default: https://github.com/cloudera/hue/blob/master/desktop/conf.dist/hue.ini#L63
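For anyone who wants to opt out, the knob in question (per the link above) sits in the [desktop] section of hue.ini. Sketch below; the file path follows the usual packaged layout:

```ini
# /etc/hue/hue.ini -- opt out of anonymous usage reporting (Google Analytics)
[desktop]
  collect_usage=false
```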


Other than that, it was an enjoyable experience to use Apache Bigtop
0.7.0 RC0.
Doing SQL queries through Phoenix was pretty impressive and did not require
much work to set up.
Also, seeing Apache Hadoop and Apache HBase logs being shipped by Flume to
ElasticSearch and then being able to query events and create some dynamic
charts in Kibana was exciting!
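For the curious, the Flume side of that pipeline boils down to an agent config along these lines. The agent, sink, and channel names plus the host are mine; the sink class and property names are from Flume 1.4's ElasticSearchSink:

```properties
# Sketch of a flume.conf fragment shipping events to ElasticSearch
# (agent/sink/channel names and the host are illustrative)
agent.sinks.es.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es.hostNames = es-host:9300
agent.sinks.es.indexName = flume
agent.sinks.es.clusterName = elasticsearch
agent.sinks.es.channel = memchannel
```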

Great! Would you be willing to help with a blog post on the release? ;-)


Sure.
Do you have anything particular in mind?

Also, since I am about to test Apache Solr, is there an equivalent to Kibana
I can use for visualizing my indexed logs?

Well, there's the Hue Search app. Have you had a chance to try it?


Not yet. But will do.


Thanks,
Roman.



Thanks,
Bruno
