Re: [ANNOUNCE] New Bigtop PMC member: Yuqi Gu

2021-01-09 Thread Bruno Mahé

Congratulations!

On 1/7/21 1:02 AM, Youngwoo Kim (김영우) wrote:

Congratulations and welcome, Yuqi!

- Youngwoo

On Thu, Jan 7, 2021 at 2:48 AM, Evans Ye wrote:


On behalf of the Apache Bigtop PMC, I am pleased to announce that
Yuqi Gu has been elected to the Bigtop Project Management
Committee. We appreciate Yuqi's contributions thus far, and look
forward to his continued involvement in his new role at Apache Bigtop.

Please join me in congratulating Yuqi Gu!


--
Thanks,
Bruno



Re: [Announce] ElastiCluster: a tool to deploy Hadoop/Spark clusters based on BigTop

2018-04-06 Thread Bruno Mahé

Wow! Really nice!


Thanks for sharing with us!


Thanks,

Bruno


On 04/04/2018 11:59 AM, Riccardo Murri wrote:

Hello!

I would like to bring to your attention ElastiCluster [1] [2], a tool
for deploying various kinds of compute clusters on IaaS clouds. Thanks to
BigTop (and to the developers behind it!), ElastiCluster can also deploy
functional Hadoop+Spark clusters [3].

ElastiCluster does not use the BigTop provisioner; instead it opts for its
own Ansible-based deployment playbooks. The provisioned software is
currently limited to Hadoop + Spark + Thriftserver (from BigTop 1.2.1),
but it can be integrated with other non-BigTop software (e.g.,
JupyterHub).
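
For the curious, getting a cluster up is roughly this (a minimal sketch
based on the quickstart [2]; the cluster name "hadoop" is an assumption
and must match a cluster template defined in your ElastiCluster
configuration first):

  # install the tool, then provision and log into a cluster
  pip install elasticluster
  elasticluster start hadoop   # creates the VMs and runs the Ansible playbooks
  elasticluster ssh hadoop     # log into the cluster's front-end node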

AFAIK, the main use for Hadoop+Spark on ElastiCluster so far has been
setting up small clusters for teaching purposes. I'd be glad of any
feedback, especially from anyone willing to try it for more
"serious" use cases, and to discuss more general topics (here or
on the ElastiCluster mailing list).

[1]: http://elasticluster.readthedocs.io/en/latest/
[2]: http://elasticluster.readthedocs.io/en/latest/install.html#quickstart
[3]: http://elasticluster.readthedocs.io/en/latest/playbooks.html#hadoop-spark

(I hope this kind of announcement is welcome on the list; I could find
no policy on allowed topics on the BigTop web site or in the mailing list index.)

Kind regards,
Riccardo

--
Riccardo Murri

S3IT: Services and Support for Science IT
University of Zurich




Re: Congratulations to our new Chair: Evans Ye

2017-03-02 Thread Bruno Mahé

Congratulations Evans!


Thanks,

Bruno


On 02/27/2017 09:45 PM, Olaf Flebbe wrote:
Last night the resolution to change the project chair passed the board
unanimously:


Please join me in congratulating Evans Ye on becoming our new Chair.

For me it was a pleasure to serve the community in the past year, and I
would like to thank everybody for supporting me in doing the job.


Congrats again, Evans!


-- Olaf






Re: Rebooting the conversation on the Future of bigtop: Abstracting the backplane ? Containers?

2015-06-20 Thread Bruno Mahé
Echoing both Nate and Evans, I would not limit ourselves based on the 
technology used for the build.


However, I am not sure I completely follow option 3. We are doing that
already for packages. For instance, if package A depends on Apache
ZooKeeper, then package A declares a dependency on Apache ZooKeeper and
includes symlinks to the Apache ZooKeeper library provided by the Apache
ZooKeeper package.
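
As a hedged illustration of what I mean (the package name and paths below
are made up for the example, not taken from an actual Bigtop package):

  # package A declares the dependency...
  $ rpm -q --requires packageA | grep -i zookeeper
  zookeeper
  # ...and ships a symlink instead of its own copy of the jar
  $ ls -l /usr/lib/packageA/lib/zookeeper.jar
  /usr/lib/packageA/lib/zookeeper.jar -> /usr/lib/zookeeper/zookeeper.jar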



Thanks,
Bruno


On 06/19/2015 12:47 PM, n...@reactor8.com wrote:


Echoing Evans, I think we should not be worried about stateless vs.
non-stateless containers. The core idea and need is to optimize
the build process and maximize re-use, whether on host machines, in
containers, or across build environments.


Added a sub-task with Olaf's idea to Evans' umbrella CI task; currently
marked it for 1.1:


https://issues.apache.org/jira/browse/BIGTOP-1906

From: Evans Ye [mailto:evan...@apache.org]
Sent: Friday, June 19, 2015 7:16 AM
To: user@bigtop.apache.org
Subject: Re: Rebooting the conversation on the Future of bigtop:
Abstracting the backplane ? Containers?


I think it's not a problem that the container is not stateless. In any
case we should have CI jobs that build all the artifacts and store
them as official repos.
You point out an important thing: mvn install is the key
feature for propagating self-patched components around. If we disable
this then there's no reason to build jars ourselves. I'm +1 on
option 2.


On Jun 19, 2015 at 5:59 AM, Olaf Flebbe o...@oflebbe.de wrote:



On 18.06.2015 at 23:57, jay vyas jayunit100.apa...@gmail.com wrote:

You can easily share the artifacts with a Docker shared volume:

in the container, export M2_HOME=/container/m2/

followed by

docker run -v ~/.m2:/container/m2

This will put the mvn jars onto the host rather than into the guest
container, so that they persist.
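
Spelled out, the idea looks like this (the image name and build command
are placeholders, not an actual Bigtop image):

  # share the host's Maven repository with the container so that
  # downloaded artifacts outlive the container
  docker run -v ~/.m2:/root/.m2 some-build-image mvn install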



That's not the point. Containers are not stateless any more.

Olaf





Re: Rebooting the conversation on the Future of bigtop: Abstracting the backplane ? Containers?

2015-06-16 Thread Bruno Mahé

On 06/15/2015 09:22 AM, jay vyas wrote:
Hi folks. Every few months, I try to reboot the conversation about
the next generation of Bigtop.


There are three things which I think we should consider: a backplane
(rather than deploying to machines), the meaning of the term "ecosystem"
in a post-Spark in-memory apocalypse, and containerization.


1) BACKPLANE: The new trend is to have a backplane that provides
networking abstractions for you (Mesos, Kubernetes, YARN, and so
on). Is it time for us to pick a resource manager?


2) ECOSYSTEM?: Nowadays folks don't necessarily need the whole Hadoop
ecosystem, and there is a huge shift to in-memory, monolithic stacks
happening (e.g. GridGain or Spark can do what 90% of the Hadoop
ecosystem already does, supporting streams, batch, and SQL all in one).


3) CONTAINERS: We are doing a great job with Docker in our build
infra. Is it time to start experimenting with running Docker tarballs?


Combining 1+2+3, I could see a useful big-data upstream distro which
(1) just installs an HCFS implementation (Gluster, HDFS, ...) alongside,
say, (2) Mesos as a backplane for the tooling for [[ HBase +
Spark + Ignite ]], and then (3) does the integration testing of
available Mesos-framework plugins for Ignite and Spark underneath. If
other folks are interested, maybe we could create the "1x" or
"in-memory" branch to start hacking on it sometime? Maybe even
bring the Flink guys in as well, as they are interested in Bigtop
packaging.




--
jay vyas



I have roughly the same position as Andrew on that matter.

What prevents you from starting something yourself and hacking on it?


Thanks,
Bruno


Re: BIGTOP-1x branch.. Do we need multitenancy systems?

2015-02-11 Thread Bruno Mahé

On 02/10/2015 10:05 PM, Roman Shaposhnik wrote:

On Tue, Feb 10, 2015 at 6:00 PM, RJ Nowling rnowl...@gmail.com wrote:

Can we articulate the value of packages over tarballs?  In my view, packages
are useful for managing dependencies and in-place updates.

In my view packages are the only way to get into the traditional IT deployment
infrastructures. These are the same infrastructures that don't want to touch
Ambari at all, since they are all standardized on Puppet/Chef and traditional
Linux packaging.

There are quite a few of them out there still, despite all the push from
Silicon Valley to get everybody onto things like Docker, etc.


+1.

I like Docker and it is a very nice project, but it is not going to be
an end in itself.
Companies will continue to have various kinds of hosts, from bare metal to
different cloud providers (SaaS, PaaS...), Docker included.


Aside from that, using packages provides many benefits over tarballs
(see the example commands after this list):
* Packages carry metadata, so I know what file belongs where, how, and
in what version.
* All the dependencies are specified in the package, which makes it easier to
reuse even across Docker files. This includes system dependencies as
well (e.g.: who depends on psmisc? Why? Can it be removed now that we
updated Apache Hadoop?).
* It enables us to respect the Single Responsibility Principle and to
satisfy everyone, folks using bare metal as well as cloud technology users.
* Some patches may still need to be applied for compatibility/build
reasons. Using packages makes that easier.
* It provides a deep integration with the system, so it just works:
users are created, init scripts set up, alternatives set up, everything has
the right permissions...
* It makes it dead easy to build multiple variants of the
same image, since everything is pulled and set up correctly. If I were to
manually unpack tarballs, I would have to take care of all that by hand, and
it would also take a lot more space than the package equivalent unless I
spent a lot of time deleting internal parts of each component. Example:
I want the Hadoop client and fuse only for a variant.


Note that this could also be done with tarballs, but it would
require a lot of duplicated command lines and trial and error, and
it wouldn't be as maintainable.
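
For instance, the metadata and dependency queries mentioned above boil
down to standard rpm/dpkg commands ("hadoop" here is just an example
package name):

  # RPM-based systems
  rpm -qi hadoop             # metadata: version, vendor, description...
  rpm -ql hadoop             # which file belongs where
  rpm -q --requires hadoop   # declared dependencies, system ones included

  # Debian-based systems
  dpkg -s hadoop             # metadata, including the Depends: field
  dpkg -L hadoop             # file list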


In conclusion, even if Apache Bigtop were to focus on Docker, building
packages would be much better than dropping them and going toward a
'tarball' approach. Packages would not only be more maintainable and
satisfy more use cases, but would also provide an abstraction layer so
the Docker files could focus on the image itself instead of setting up
the various combinations of Apache Hadoop components.
From a 10,000 ft view and in broad strokes, Docker is not much
different from Vagrant or BoxGrinder. For those tools, having the recipes
use the packages simplified a lot of things, and I don't see why
it would be different with Docker.




Related question: what are BigTop's goals? Just integration testing?
Full blown distro targeted at end users? Packaging for others to build distros 
on top of?

All of the above? ;-) Seriously, I think we need to provide a way for consumers
of bigdata technology to be able to deploy it in the most efficient
way. This means
that we are likely to need to embrace different ways of packaging our stuff.

Thanks,
Roman.

+1 again

Another way to put it is that we want to make the Apache Hadoop ecosystem usable.
That includes making it consumable as well as verifying that it all
works together.
Packages have been the main way to consume such artifacts, but we have
always been open to other ways (see Vagrant and BoxGrinder). We even
had at some point a kickstart image to build bootable USB keys with an
out-of-the-box working Apache Hadoop environment :)


If tomorrow packages become obsolete, I don't see why we could not drop 
them. But I think we are still far from that.



Thanks,
Bruno


Re: Meet up next week

2014-07-15 Thread Bruno Mahé

I should be able to make it, although I will probably leave work at 6pm.
I need to catch up with the latest changes :)

Thanks,
Bruno

On 07/15/2014 02:08 PM, Sean Mackrory wrote:

Adding dev back to Peter's email.

Let's go with Tuesday, July 22nd. Sounds like it is as good a day as any
next week, though it'll be a shame if Roman and Cos are indeed away the
whole week. You'll all just have to have another Bigtop meetup without
me :) If people arrive early I'm happy to let them in and we can hang
out until the official start time.


On Tue, Jul 15, 2014 at 11:38 AM, Peter Linnell plinn...@apache.org wrote:


+1 for me. I will arrive a bit earlier to catch up with some other
Cloudera folks.

Do you have a specific date in mind, so we can add it to the Meetup page?

Thanks,

Peter



On Mon, 14 Jul 2014 14:14:34 -0700, Julien Eid julien@cloudera.com wrote:

+1, sounds like a good time.


On Mon, Jul 14, 2014 at 2:09 PM, Sean Mackrory mackror...@gmail.com wrote:

Also, the obligatory pizza and beverages will be provided.


On Mon, Jul 14, 2014 at 2:59 PM, Sean Mackrory mackror...@gmail.com wrote:

Folks,

I'd like to propose a little get-together at Cloudera's Palo Alto
office some evening next week. I'm proposing next week primarily
for selfish reasons, as I'll be in town and it'd be nice to see
those in the Bigtop community who I haven't seen much since
leaving California. It would also be a good time to perhaps look
at the new Gradle build system, polish up some finishing touches
on the 0.8.0 release, hack on other stuff or simply socialize. I'd
be happy to arrange a remote hangout as well.

Just to get a stake in the ground, shall we say Tuesday (July
26th) evening, perhaps 5:30 onward? Who would be able to come? Is
there another time or day that might work better?






Re: Cluster Management: OpenSource Vendor Options

2013-12-30 Thread Bruno Mahé

On 12/30/2013 08:32 AM, Steven Núñez wrote:


The CDH, BigTop and HDP (I assume) base distributions require a lot of
manual configuration, so the best way to spin up a cluster with a
reasonable set of applications (say HDFS, YARN, Hive, HCatalog, HBase,
ZooKeeper, Oozie, Pig, Sqoop) is to use CDH + CM or Ambari + HDP.



Some people have also automated this through tools such as Puppet, Chef 
or Ansible.
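
For instance, deploying with Apache Bigtop's own Puppet recipes looks
roughly like this (a sketch only; the paths follow the bigtop-deploy
layout of the Bigtop source tree, and the cluster still has to be
described in the recipes' configuration first):

  # from a Bigtop checkout, on the node being provisioned
  puppet apply -d \
      --modulepath=bigtop-deploy/puppet/modules \
      bigtop-deploy/puppet/manifests/site.pp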



Thanks,
Bruno


Re: bigtop manual: how will the contents evolve?

2013-12-11 Thread Bruno Mahé

On 12/11/2013 08:36 PM, Jay Vyas wrote:

Hi bigtop!

Just curious, because I don't see any substantial contents in the docbook
yet, so I'm wondering how it will evolve. Where will they go, and how should
the Bigtop folks contribute to them? I don't know much about DocBook
and how docbookX is meant to be used here, but it looks like a cool way
to propagate docs in the code.



Right now, there is no content :)

The way it works is:
* src/site/docbookx/apache-bigtop-user-guide.xml is the main entry point 
to the user guide. If we were to add another guide, we would simply 
create another similar file in that directory.


* You could put the entire document in that one file,
apache-bigtop-user-guide.xml, but for obvious reasons I split it into
multiple files and import them through xi:include.


* Therefore each chapter of the user guide can be found in
src/site/docbookx/userguide/. If a chapter gets too big, we can always
split it again into multiple files.
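
To make that layout concrete (the chapter file names here are purely
illustrative):

  src/site/docbookx/apache-bigtop-user-guide.xml  # main entry point; pulls in each chapter via xi:include
  src/site/docbookx/userguide/installation.xml    # hypothetical chapter file
  src/site/docbookx/userguide/configuration.xml   # hypothetical chapter file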




So any contribution should mostly go into one of the files dedicated to a
chapter.
Regarding the content itself, in order to keep it organized I have put
together a proposal for the list of chapters.

See my email about the documentation overhaul.
My plan is to add information here and there according to the list of
chapters. I don't see any reason to work chapter by chapter.

So feel free to pick a chapter or a section and start sending patches.
Try to keep them small so they can be reviewed without too much trouble,
and so there is less chance of patches stepping on each other. Also,
English is not my mother tongue, so I don't mind patches improving
sentences.
So my main recommendation is to stick with the plan. Beyond that, any
part is fair game if you want to contribute a paragraph or even just a
sentence. Any help is welcome!



I will also volunteer to push the website on a regular basis so changes
become visible without too much delay.




Thanks,
Bruno


Re: Best Way to Determine Package Versions

2013-11-20 Thread Bruno Mahé

On 11/20/2013 08:44 PM, Steven Núñez wrote:

Gents,

What is the best way to determine the particular version of a BigTop
package? A command for this would be very useful. This particular use
case involves trying out the Oozie component according to the "Running
Various BigTop Components"
https://cwiki.apache.org/confluence/display/BIGTOP/Running+various+Bigtop+components
wiki page. The instructions for Oozie appear to be out of date, and the
Oozie website http://oozie.apache.org has different configurations for
different versions.

It seems that the Oozie instructions on that page are particularly out
of date.

Regards,
- SteveN


I assume you are referring to an installed package.

On an RPM-based GNU/Linux distribution:
rpm -qi <package name>
rpm -qf <file or directory>

Note that you can also use yum (yum info, yum list).


For deb-based distributions, I don't have one handy right now, but the
following link should help you in that regard:

http://www.debian.org/doc/manuals/debian-faq/ch-pkgtools.en.html
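
From memory, the usual Debian-side equivalents are along these lines:

  dpkg -s <package name>           # metadata of an installed package, including Version:
  dpkg -S <file or directory>      # which package owns a given file
  apt-cache policy <package name>  # installed vs. candidate versions from the repos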


Thanks,
Bruno


Re: conf.empty

2013-11-17 Thread Bruno Mahé

You don't find them, you set them up.
ZOOKEEPER_HOSTNAME is the name of the host where ZooKeeper is running,
and HDFS_HOSTNAME the name of the host where the NameNode is running.
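
If you want to check whether they are actually running, something along
these lines should work (the service names assume the Bigtop packages;
"ruok" is a standard ZooKeeper four-letter command):

  sudo service zookeeper-server status      # is ZooKeeper up on this host?
  echo ruok | nc ZOOKEEPER_HOSTNAME 2181    # a healthy ZooKeeper answers "imok"
  sudo service hadoop-hdfs-namenode status  # is the NameNode up?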


On 11/15/2013 10:07 AM, ivaylo frankov wrote:

Hi Bruno,

Thank you very much for your email. It gives me hope ;).
Would you mind telling me how I can find ZOOKEEPER_HOSTNAME and
HDFS_HOSTNAME in Bigtop?
I suppose that HDFS_HOSTNAME is already started, but is that also valid
for ZOOKEEPER_HOSTNAME?
How can I check that?

Sorry for the funny questions, but I am just a perfect beginner. Hopefully,
thanks to Bigtop, not for very long ;)

Thank you once again and
Cheers,
Ivo

On Friday, November 15, 2013, Bruno Mahé wrote:

On 11/10/2013 09:28 AM, ivaylo frankov wrote:

Dear All,

I installed Bigtop 0.7.0, but after every restart of my computer
there is a message that the /tmp folder cannot be found. HBase tables
are also deleted after reset.

I tried to see the configuration for the /tmp folder, but when I run

ivo@ivo-Aspire-3830T:/usr/lib/hadoop-hdfs/bin$ dpkg -L hadoop

/etc/hadoop/conf.empty
/etc/hadoop/conf.empty/configuration.xsl
/etc/hadoop/conf.empty/hadoop-env.sh
/etc/hadoop/conf.empty/ssl-client.xml.example
/etc/hadoop/conf.empty/slaves
/etc/hadoop/conf.empty/hadoop-metrics2.properties
/etc/hadoop/conf.empty/log4j.properties
/etc/hadoop/conf.empty/hadoop-policy.xml
/etc/hadoop/conf.empty/ssl-server.xml.example
/etc/hadoop/conf.empty/core-site.xml
/etc/hadoop/conf.empty/hadoop-metrics.properties

these files are empty.
Would you mind telling me how I can start Hadoop in pseudo mode
and configure the right directory for tmp, to be able to keep HBase
tables?
Thank you very much!

Cheers,
Ivo



Hi Ivaylo,

The Apache HBase package does not come with a pseudo-conf package, and that
is something I hope to fix at the next hackathon.
Also, by default and if I remember correctly, Apache HBase will write
its data to /tmp.

So you may want to look into the Apache HBase documentation in order to
configure Apache HBase according to your setup. The Apache Bigtop
puppet recipes may also help, since they provide a working
configuration for a distributed cluster.

If it helps, here is a simple configuration I use sometimes for
hbase-site.xml:

<configuration>

  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>

  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>ZOOKEEPER_HOSTNAME</value>
  </property>

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://HDFS_HOSTNAME:8020/hbase</value>
  </property>

  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>

</configuration>

Please replace ZOOKEEPER_HOSTNAME and HDFS_HOSTNAME accordingly.
You can also replace the value of hbase.rootdir with a local directory
if you do not want to go through Apache Hadoop HDFS.



Thanks,
Bruno





Re: [VOTE] Bigtop 0.7.0 RC0

2013-10-30 Thread Bruno Mahé
Since we are going to spin a new Apache Bigtop 0.7.0 RC1, here is my 
formal -1 for Apache Bigtop 0.7.0 RC0.


See inline.

On 10/29/2013 03:49 PM, Roman Shaposhnik wrote:

On Mon, Oct 28, 2013 at 11:14 PM, Bruno Mahé bm...@apache.org wrote:

I would agree with that. The only trouble is I can't repro it :-(
Could you please provide additional details on the JIRA?

But if it is easy to repro (and I'm just missing something)
I'd agree with you.



I updated the ticket with some more information.
If you still cannot reproduce it, feel free to ping me and I will give you
access to the instance.
At some point I was wondering if it was because I am mixing the Amazon AMI
with CentOS 6 repos, but given that every other service I try does work...


Thanks for the detailed instructions on how to repro. It is indeed
easily reproducible, and it is also extremely easy to fix.

Now, at this point, I'm absolutely in favor of spinning up RC1
with the following fixes in: BIGTOP-1132 and BIGTOP-1129

Both of those are really isolated fixes and here's what I'd like
to propose: I respin on Wed and send out a new VOTE thread.
This time, however, the voting will only go till noon on Sun 11/3.

If anybody objects to that -- please let me know ASAP.



Awesome!
Sounds like a great plan.

I came home too late to verify the patch for BIGTOP-1129, but hopefully I
will be able to do so either tomorrow or Thursday.





The rest of my comments inline:


I can help with that. ;-) At least with the default use case. Let me know
if you're still interested.



Sure. Any recommendations?


Basically, you can simply follow these setup docs:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-Installation-Guide/csig_install_search.html
and especially this part:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-Installation-Guide/csig_deploy_search_solrcloud.html

It also covers the Hue Search app. It would be very nice if somebody
could help come up with Bigtop-specific wiki docs, but for now
Cloudera Search is close enough.


At this point the Puppet code is the most reliable way to deploy Hue. Part of
it has to do with the fact that Hue now depends on way more external services
than it used to -- which means that you have to get the configs 'just right'.
Our puppet works great for that -- but that's for a fully distributed cluster.



I was not using puppet (on purpose).
I will give it another shot by looking at what our puppet recipes are doing
and see if any of these changes can be baked directly into our packages.


Thanks! That would be appreciated.


Actually, it was the Elasticsearch sink for Flume which had me update the
Apache Lucene jars.

I will probably open a ticket against Apache Flume directly sometime this
week to ask for some clarification.


Please do. I believe I fixed some of those issues in the upcoming Flume 1.5.0


WAT? Really. I'm pretty curious at this point



I had the same reaction.
But this was reported as blocked by NoScript.
Also, it seems to be a configuration option activated by default:
https://github.com/cloudera/hue/blob/master/desktop/conf.dist/hue.ini#L63


Do you think we can convince Hue upstream to not spy
on its users by default? ;-)

I guess worst case scenario -- we can always disable it downstream in Bigtop.



I will start by opening a ticket with Hue and Bigtop.
And as you said, worst case scenario, that flag can be disabled over here.

I opened https://issues.apache.org/jira/browse/BIGTOP-1135 to track that 
effort.




Great! Would you be willing to help with a blog post on the release? ;-)



Sure.
Do you have anything particular in mind?


Just a bit more verbose version of the feedback you've provided on this
thread, I guess. Makes sense?




Makes sense.



Well, there's Hue Search app. Have you had a chance to try it?



Not yet. But will do.


Give it a try -- it's like Google, but on your data ;-)

And the docs I quoted above cover it as well.

Thanks,
Roman.



Thanks!
Will take a look.


Thanks,
Bruno



Re: [VOTE] Bigtop 0.7.0 RC0

2013-10-28 Thread Bruno Mahé

On 10/18/2013 09:54 PM, Roman Shaposhnik wrote:

This is the seventh release for Apache Bigtop, version 0.7.0

It fixes the following issues:
   http://s.apache.org/Pkp

*** Please download, test and vote by Fri 10/25 noon PST

Note that we are voting upon the source (tag):
release-0.7.0-RC0

Source and binary files:
   
https://repository.apache.org/content/repositories/orgapachebigtop-194/org/apache/bigtop/bigtop/0.7.0/

Binary convenience artifacts:
http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.7.0/

Documentation on how to install (just make sure to adjust the repos for 0.7.0):
  
https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+0.6.0

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachebigtop-194/

The tag to be voted upon:

https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=commit;h=fb628180d289335dcf95641b44482fb680f11573

Bigtop's KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/bigtop/dist/KEYS

Thanks,
Roman.




I am not voting yet since I still have some time, but so far I am 
leaning toward a -1.


I am leaning toward a -1 because of
https://issues.apache.org/jira/browse/BIGTOP-1129 and my issues with Hue.
Other than that, everything I tested either just works out of the box or
is a nitpick.
But BIGTOP-1129 is what I would consider a blocker, since it is part of
the basic use case of Apache Bigtop.


Things I tested:
* Apache Hadoop and some basic jobs
* Apache HBase and Phoenix. Just basic testing
* Apache Flume sending Apache Hadoop and Apache HBase logs to an
Elasticsearch instance, visualized through Kibana
* Hue smoke tests
* Everything running on OpenJDK 6 on EC2 instances

Things I still want to test (or rather, things I hope I can test by 
Tuesday evening):

* Apache Pig and datafu
* Apache Solr
* Load more data into Phoenix


Things we could do better:
* As described in BIGTOP-1129, I could not stop the datanode/namenode
through the init scripts.
* We could provide some templates for Apache Hadoop. I wasted a few
hours just to get the pi job running. Thankfully we have the init script
for HDFS (which needs some tweaks for the staging directory) and
templates for the configuration files in our puppet modules.
* I enabled short-circuit reads in Apache HBase. Not sure if I missed
something, but I got some
org.apache.hadoop.security.AccessControlException: Can't continue with
getBlockLocalPathInfo() authorization exceptions. From reading
http://www.spaggiari.org/index.php/hbase/how-to-activate-hbase-shortcircuit
it seems there are a few things we could do to make it work out of the box.
* Not sure what I did wrong, but although I could access the Hue UI, most
apps I tried were not working. E.g.: all shells give me the error "value
222 for UID is less than the minimum UID allowed (500)", and the file
browser gives me the error "Cannot access: /. Note: You are a Hue admin
but not a HDFS superuser (which is hdfs).". Note that the first user I
created was a user named "ec2-user". Although it is not an HDFS super
user, I would expect to have a working equivalent of what I can browse
with the hdfs -ls command. Also, creating a Hue user named "hdfs"
yields the same result. Note that I did not have time to dig further.
* Phoenix directly embeds the Apache Hadoop, Apache HBase and Apache
ZooKeeper jars. These jars should be symlinks.
* Phoenix required me to delete some old Apache Lucene jars from the Apache
Flume installation directory. From the output of the command mvn
dependency:tree on the Flume project, it appears these jars are only
needed for the ElasticSearch and MorphlineSolrSink plugins, but the Flume
documentation for both of these plugins explicitly asks users to provide
the jars of Apache Lucene and Apache Solr/ElasticSearch themselves (since
they may use a different version of Apache Lucene). So the dependency on
Apache Lucene in Apache Flume should probably be marked as "provided",
and we should probably provide some packages to manage these dependencies.
* I still need to figure out why my instance of Hue needs access to 
google-analytics.com



Other than that, it was an enjoyable experience to use Apache Bigtop
0.7.0 RC0.
Doing SQL queries through Phoenix was pretty impressive and did not
require much work to set up.
Also, seeing Apache Hadoop and Apache HBase logs being shipped by Flume
to Elasticsearch and then being able to query events and create some
dynamic charts in Kibana was exciting!



Also, since I am about to test Apache Solr, is there an equivalent to 
Kibana I can use for visualizing my indexed logs?



Thanks,
Bruno


Re: [VOTE] Bigtop 0.7.0 RC0

2013-10-25 Thread Bruno Mahé

On 10/18/2013 09:54 PM, Roman Shaposhnik wrote:

This is the seventh release for Apache Bigtop, version 0.7.0

It fixes the following issues:
   http://s.apache.org/Pkp

*** Please download, test and vote by Fri 10/25 noon PST

Note that we are voting upon the source (tag):
release-0.7.0-RC0

Source and binary files:
   
https://repository.apache.org/content/repositories/orgapachebigtop-194/org/apache/bigtop/bigtop/0.7.0/

Binary convenience artifacts:
http://bigtop01.cloudera.org:8080/view/Releases/job/Bigtop-0.7.0/

Documentation on how to install (just make sure to adjust the repos for 0.7.0):
  
https://cwiki.apache.org/confluence/display/BIGTOP/How+to+install+Hadoop+distribution+from+Bigtop+0.6.0

Maven staging repo:
https://repository.apache.org/content/repositories/orgapachebigtop-194/

The tag to be voted upon:

https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=commit;h=fb628180d289335dcf95641b44482fb680f11573

Bigtop's KEYS file containing PGP keys we use to sign the release:
http://svn.apache.org/repos/asf/bigtop/dist/KEYS

Thanks,
Roman.




I haven't had time to test it yet, but I should be able to get to it this
weekend.

Would it be possible to postpone the end of the vote until Sunday?

Also, I could not find any Jenkins job with all our tests running against
this release.

It would be great to put such a link in the vote email.


Thanks,
Bruno


Re: permanent builds on jenkins

2013-07-06 Thread Bruno Mahé

On 07/06/2013 07:33 AM, Jay Vyas wrote:

Hi bigtop:

Are there any permanent builds saved on Jenkins (for the VM matrix)?

If not, it would be nice to add them for certain known, well-tested,
working disk images.

(For context, I'm currently running the MR2 build of the KVM box and it
appears to have some intermittent write issues on the DataNode path, and
also, my NameNode appears to really like being in safe mode. These
could just be due to VM setup though, as I'm changing some things like
adding static IPs and DataNode write paths... so nothing to be alarmed
about.)

--
Jay Vyas
http://jayunit100.blogspot.com


Hi Jay,

Could you define "permanent build"?
I am not sure if this fits your requirement, but Jenkins has a link to
the latest successful build (e.g.:
http://bigtop01.cloudera.org:8080/job/Bigtop-VM-matrix/BR=master,KIND=kvm,label=fedora16/lastSuccessfulBuild/artifact/bigtop-vm-kvm-master.tar.gz
)


We do not store convenience artifacts of VMs, since no one has asked about
it before.
So ideally, the known well-tested working disk images would be the ones from
Apache Bigtop releases. But right now, there is not much testing of our
VMs. Any help on that front would be welcome!


Note also that I added that VM more as a base VM for an Apache Hadoop
cloud image than as a developer VM. That's why there is not much in it, as
well as no desktop pre-configured.
So depending on your needs, we may want to add a new VM or enhance the
current one (also, BoxGrinder enables inheritance between appliances).


Also, BoxGrinder is apparently not being maintained anymore, so we may
want to look into other VM builders (Oz, etc.).



Thanks,
Bruno


Re: Building bigtop with patches

2013-03-06 Thread Bruno Mahé

On 03/01/2013 05:34 PM, Jagat Singh wrote:

Hello Roman,

Thank you for your answers.

I was trying to make a Hadoop deb with a few patches and was wondering why
it was not working :)

And now I know that with deb and rpm packages, patches won't be picked up
from the patch folder. However, if I wanted a tar, I would get the patches
applied.

Any suggestions for making a patched deb?

Thanks once again.




Hi Jagat,

Sorry for the late reply, and I hope you have fixed your issue by now; but in
case you still need an answer, you can look at
https://github.com/apache/bigtop/commit/d2d1df918406774ae3ad2489de4e2eccf8e54634


This commit *removes* a patch from the build, so the interesting parts are in red.
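
If it helps, the moving parts are roughly these (a hedged sketch of the
layout from that era; the file names are illustrative, so check your
checkout):

  bigtop-packages/src/common/hadoop/my-fix.patch    # drop your patch here
  bigtop-packages/src/rpm/hadoop/SPECS/hadoop.spec  # reference it for rpm builds (Patch0:/%patch0)
  bigtop-packages/src/deb/hadoop/                   # and hook it into the deb build rules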

Feel free to let us know if you need more information.

Thanks,
Bruno



Re: BigTop 0.3.0 vs make hadoop-deb

2013-02-12 Thread Bruno Mahé

On 02/08/2013 09:38 AM, Jean-Marc Spaggiari wrote:

Hi,

I'm trying to build hadoop-deb with Bigtop 0.3.0 to get the libdfs
files, because I'm not able to compile them manually.

However, it seems that Bigtop is not able to either ;)

First, the script tries to retrieve the 1.0.1 version of Hadoop, but
it's no longer available on the website. I was able to find it in
another folder and I placed it into the dl folder.

Now, it's failing on the parsechangelog part of the script :(

http://pastebin.com/aaPyWt7M

Any idea how I can fix that? I tried to replace the 1.0.1 jar with the
1.0.3 (the one I need) and got the exact same error.

Thanks,

JM




Hi Jean-Marc,

This ought to work, so there must be some issue somewhere.
The only thing I found from googling around is some reference to the
French locale not being parsed correctly by the deb scripts.


So could you try setting your locale to English, so dates do not contain
accents?
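
For instance, something like this should be enough to test it (using the
make target from your original message):

  # force a plain English locale just for this build
  LC_ALL=C make hadoop-deb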



Thanks,
Bruno