Blueprint changed by James Page:

Whiteboard changed:
- Validation of Assumptions from spec:
+ Summary of objectives for Precise:

- Ubuntu will package Apache Hadoop (rather than one of the various variants).
- Cloudera - CDH
- Hortonworks - Apache Hadoop
- Employ 80% of upstream committers
- OpenJDK Support
- Ubuntu should help drive support for OpenJDK from upstreams
- Packaging will align to Apache Bigtop (based on the most popular upstream packaging) - YES
- Packaging will focus on the most recent stable release of Hadoop - 0.203.0 - YES
- Configuration methods should take into account integration with configuration management tools such as Puppet and Chef - YES
- The majority of Java dependencies can be fulfilled through what is already in the archive (see hadoop-dependency-report.tar.gz)
- kfs - this can be excluded to disable this feature but does not look like that much work to package.
- Apache ftpserver would be required to enable smoke testing - again looks OK to package.
- Focus will be on a solid Hadoop core with contrib packages if time permits.
- Most dependencies are already in the archive apart from thrift (probably not an issue).
- Native integrations must be part of the packaging.
- Packages will target universe for this release.
-
- We need to ensure upstream co-operation from Hortonworks/Cloudera to ensure ongoing collaboration going forward.
-
- Good support for Hadoop on ARM should be an objective of this work.
-
- Comments from blueprint:
-
- I have to wonder if the demand for Hadoop really is large enough to justify the effort we'd be putting into providing it? Are we really at a point already where having terabytes of data you need to analyse is a common use case? - Soren
- Sounds like there is demand in the distribution.
-
- important for Ubuntu Server, to maintain its position as 'best OS for the Cloud'
-
- the number of users needing to process TBs of data is just increasing; over the life of 12.04 LTS, more and more users will have a need for a map-reduce cluster application; having it in the distro will ensure they pick Ubuntu for that application
-
- Work Items:
- [m_3] hadoop community input (what about no thrift?): TODO
- [jamespage] Active backport packaging process post 12.04 release: TODO
- [m_3] Attend HadoopWorld. :): TODO
+ 1) Ubuntu will target packaging Apache Hadoop
+ - Help drive support for running under OpenJDK
+ - Packaging will retain flavour of most popular upstream packaging
+ - Focus will be on most recent stable (0.20.203.0) - validate with upstream release schedule
+ - Thrift support will not be included
+ - LZO and snappy compression options will be investigated
+ - universe target this release.
+
+ 2) Juju Charms will by default align to distro packaging for precise
+
+ Full session notes from UDS-P:
+ http://pad.ubuntu.com/uds-p-servercloud-p-hadoop
+
+ Work items precise-alpha-1:
+ [m_3] Hadoop community input (what about no thrift? etc): TODO
+ [m_3] Attend HadoopWorld: TODO
  [james-page] Check on release schedule for Apache Hadoop between now and Feature Freeze: TODO
  [kirkland] Investigate upstream co-operation from Hortonworks/Cloudera to ensure ongoing collaboration going forward: TODO
+
+ Work items precise-beta-1:
  [negronjl] adjust hadoop charms to have a configurable backend hadoop, get one into the charm repository: TODO
  [james-page] Package KFS for Ubuntu: TODO
  [james-page] Package Apache ftp-server for Ubuntu: TODO
  [james-page] Package Hadoop for Ubuntu: TODO
+
+ Work Items:
+ [james-page] Active backport packaging post 12.04 release: TODO
+ [james-page] Feed back all work to Debian: TODO

--
Ubuntu Server + Hadoop and Bigdata
https://blueprints.launchpad.net/ubuntu/+spec/servercloud-p-hadoop
