Re: [VOTE] Accept Drill into the Apache Incubator
+1 (binding)

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Ted Dunning ted.dunn...@gmail.com
To: general@incubator.apache.org
Sent: Tuesday, August 7, 2012 10:41 PM
Subject: [VOTE] Accept Drill into the Apache Incubator

I would like to call a vote for accepting Drill for incubation in the Apache Incubator. The full proposal is available below. Discussion over the last few days has been quite positive. Please cast your vote:

[ ] +1, bring Drill into Incubator
[ ] +0, I don't care either way
[ ] -1, do not bring Drill into Incubator, because...

This vote will be open for 72 hours and only votes from the Incubator PMC are binding. The start of the vote is just before 3AM UTC on 8 August so the closing time will be 3AM UTC on 11 August. Thank you for your consideration!

Ted

http://wiki.apache.org/incubator/DrillProposal

= Drill =

== Abstract ==

Drill is a distributed system for interactive analysis of large-scale datasets, inspired by [[http://research.google.com/pubs/pub36632.html|Google's Dremel]].

== Proposal ==

Drill is a distributed system for interactive analysis of large-scale datasets. Drill is similar to Google's Dremel, with the additional flexibility needed to support a broader range of query languages, data formats and data sources. It is designed to efficiently process nested data. It is a design goal to scale to 10,000 servers or more and to be able to process petabytes of data and trillions of records in seconds.

== Background ==

Many organizations have the need to run data-intensive applications, including batch processing, stream processing and interactive analysis. In recent years open source systems have emerged to address the need for scalable batch processing (Apache Hadoop) and stream processing (Storm, Apache S4). In 2010 Google published a paper called Dremel: Interactive Analysis of Web-Scale Datasets, describing a scalable system used internally for interactive analysis of nested data.
No open source project has successfully replicated the capabilities of Dremel.

== Rationale ==

There is a strong need in the market for low-latency interactive analysis of large-scale datasets, including nested data (e.g., JSON, Avro, Protocol Buffers). This need was identified by Google and addressed internally with a system called Dremel. In recent years open source systems have emerged to address the need for scalable batch processing (Apache Hadoop) and stream processing (Storm, Apache S4). Apache Hadoop, originally inspired by Google's internal MapReduce system, is used by thousands of organizations processing large-scale datasets. Apache Hadoop is designed to achieve very high throughput, but is not designed to achieve the sub-second latency needed for interactive data analysis and exploration. Drill, inspired by Google's internal Dremel system, is intended to address this need.

It is worth noting that, as explained by Google in the original paper, Dremel complements MapReduce-based computing. Dremel is not intended as a replacement for MapReduce and is often used in conjunction with it to analyze outputs of MapReduce pipelines or rapidly prototype larger computations. Indeed, Dremel and MapReduce are both used by thousands of Google employees.

Like Dremel, Drill supports a nested data model with data encoded in a number of formats such as JSON, Avro or Protocol Buffers. In many organizations nested data is the standard, so supporting a nested data model eliminates the need to normalize the data. With that said, flat data formats, such as CSV files, are naturally supported as a special case of nested data.

The Drill architecture consists of four key components/layers:

* Query languages: This layer is responsible for parsing the user's query and constructing an execution plan. The initial goal is to support the SQL-like language used by Dremel and [[https://developers.google.com/bigquery/docs/query-reference|Google BigQuery]], which we call DrQL.
However, Drill is designed to support other languages and programming models, such as the [[http://www.mongodb.org/display/DOCS/Mongo+Query+Language|Mongo Query Language]], [[http://www.cascading.org/|Cascading]] or [[https://github.com/tdunning/Plume|Plume]].

* Low-latency distributed execution engine: This layer is responsible for executing the physical plan. It provides the scalability and fault tolerance needed to efficiently query petabytes of data on 10,000 servers. Drill's execution engine is based on research in distributed execution engines (e.g., Dremel, Dryad, Hyracks, CIEL, Stratosphere) and columnar storage, and can be extended with additional operators and connectors.

* Nested data formats: This layer is responsible for supporting various data formats. The initial goal is to support the column-based format used by Dremel. Drill is designed to support schema-based formats such as Protocol Buffers/Dremel, Avro/AVRO-806/Trevni and CSV,
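The nested data model and column-striping idea behind the proposal can be illustrated with a small sketch. This is a toy in plain Java, not Drill code; the record layout and the `stripe` helper are invented for the example.

```java
import java.util.*;

// Toy illustration (not Drill code): a nested, JSON-like record set and
// the column-striping idea behind Dremel-style columnar storage.
public class ColumnarSketch {

    // Pull one top-level field out of every record into a flat column,
    // so a query touching only that field never reads the rest.
    public static List<Object> stripe(List<Map<String, Object>> records, String field) {
        List<Object> column = new ArrayList<>();
        for (Map<String, Object> r : records) {
            column.add(r.get(field));
        }
        return column;
    }

    public static void main(String[] args) {
        // Two nested records; "links" is a repeated (list-valued) field.
        List<Map<String, Object>> records = List.of(
            Map.of("name", "doc-a", "links", List.of("x", "y")),
            Map.of("name", "doc-b", "links", List.of("z"))
        );
        System.out.println(stripe(records, "name")); // [doc-a, doc-b]
        // A flat CSV row is the special case where no field nests or
        // repeats, so every column stripes to exactly one value per record.
    }
}
```

A real columnar engine additionally records repetition and definition levels so that repeated and optional nested fields can be reassembled, but the read-only-the-columns-you-need property is the same.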
Re: [PROPOSAL] Drill for the Apache Incubator
I concur with Andrzej. Let's see that VOTE Ted!

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Andrzej Bialecki a...@getopt.org
To: general@incubator.apache.org
Sent: Tuesday, August 7, 2012 5:51 PM
Subject: Re: [PROPOSAL] Drill for the Apache Incubator

On 07/08/2012 21:14, Franklin, Matthew B. wrote:

-Original Message-
From: Marvin Humphrey [mailto:mar...@rectangular.com]
Sent: Monday, August 06, 2012 12:25 PM
To: general@incubator.apache.org
Cc: Grant Ingersoll; Isabel Drost
Subject: Re: [PROPOSAL] Drill for the Apache Incubator

On Thu, Aug 2, 2012 at 3:12 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Initial Source == There is no initial source code. All source code will be developed within the Apache Incubator.

Coming in without any source code is going to pose a challenge to this podling. http://www.apache.org/foundation/how-it-works.html#incubator The incubator filters projects on the basis of the likeliness of them becoming successful meritocratic communities. The basic requirements for incubation are: * a working codebase -- over the years and after several failures, the foundation came to understand that without an initial working codebase, it is generally hard to bootstrap a community. This is because merit is not well recognized by developers without a working codebase. Also, the friction that is developed during the initial design stage is likely to fragment the community.

It seems like there could be flexibility in this requirement, based on a few factors. In this case, a design discussion has been ongoing; but I would also think that any community coming in with enough people who know the Apache way may also not need as much of a solid starting point code wise.

+1. Given the credentials and the experience of proposed committers and mentors, and the fact that the initial design is already done, I don't think this is a serious risk. And it's an exciting proposal with a potentially big impact.
--
Best regards,
Andrzej Bialecki
http://www.sigram.com, blog http://www.sigram.com/blog
Information Retrieval, System Integration
Contact: info at sigram dot com

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
Re: [VOTE] accept DirectMemory as new Apache Incubator podling
+1 (member)

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Simone Tripodi simonetrip...@apache.org
To: general@incubator.apache.org
Sent: Sunday, October 2, 2011 3:36 AM
Subject: [VOTE] accept DirectMemory as new Apache Incubator podling

Hi all guys, I'm now calling a formal VOTE on the DirectMemory proposal located here: http://wiki.apache.org/incubator/DirectMemoryProposal Proposal text copied at the bottom of this email. VOTE close on Tuesday, October 4, early 7:30 AM CET. Please VOTE:

[ ] +1 Accept DirectMemory into the Apache Incubator
[ ] +0 Don't care
[ ] -1 Don't Accept DirectMemory into the Apache Incubator because...

Thanks in advance for participating! All the best, have a nice day, Simo

P.S. Here's my +1

http://people.apache.org/~simonetripodi/
http://www.99soft.org/

= DirectMemory =

== Abstract ==

The following proposal is about Apache !DirectMemory, a Java !OpenSource multi-layered cache implementation featuring off-heap memory storage (a la Terracotta !BigMemory) to enable caching of Java objects without degrading JVM performance.

== Proposal ==

!DirectMemory's main purpose is to act as a second-level cache (after a heap-based one) able to store large amounts of data without filling up the Java heap and thus avoiding long garbage collection cycles. Although serialization has a runtime cost, store/retrieve operations are in the sub-millisecond range, which is acceptable in nearly every usage scenario, even as a first-level cache; most of all, off-heap storage outperforms heap storage when the number of entries grows beyond a certain point. !DirectMemory implements cache eviction based on a simple LFU (Least Frequently Used) algorithm and also on item expiration. Included in the box is a small set of utility classes to easily handle off-heap memory buffers.

== Background ==

!DirectMemory is a project that was born in 2010 thanks to Raffaele P.
Guidi's initial effort under [[https://github.com/raffaeleguidi/!DirectMemory/|GitHub]], and is already licensed under the Apache License 2.0.

== Rationale ==

The rationale behind !DirectMemory is bringing off-heap caching to the open source world, empowering FOSS developers and products with a tool that enables breaking the heap barrier and overriding the JVM garbage collection mechanism. This could be useful in scenarios where RAM needs exceed the usual limits (more than 8, 12 or 24 GB) and to ease usage of off-heap memory in general.

= Current Status =

== Meritocracy ==

As a majority of the initial project members are existing ASF committers, we recognize the desirability of running the project as a meritocracy. We are eager to engage other members of the community and operate to the standard of meritocracy that Apache emphasizes; we believe this is the most effective method of growing our community and enabling widespread adoption.

== Core Developers ==

In alphabetical order:

* Christian Grobmeier grobmeier at apache dot org
* Maurizio Cucchiara mcucchiara at apache dot org
* Olivier Lamy olamy at apache dot org
* Raffaele P. Guidi raffaele dot p dot guidi at gmail dot com
* Simone Gianni simoneg at apache dot org
* Simone Tripodi simonetripodi at apache dot org
* Tommaso Teofili tommaso at apache dot org

== Alignment ==

The purpose of the project is to develop and maintain a !DirectMemory implementation that can be used by other Apache projects.

= Known Risks =

== Orphaned Products ==

!DirectMemory does not have any reported production usage yet, but it is getting traction with developers and being evaluated by potential users, so the risk of it being orphaned is minimal.

== Inexperience with Open Source ==

All of the committers have experience working in one or more open source projects inside and outside the ASF.
== Homogeneous Developers ==

The list of initial committers is geographically distributed across Europe, with no one company being associated with a majority of the developers. Many of these initial developers are experienced Apache committers already, and all are experienced with working in distributed development communities.

== Reliance on Salaried Developers ==

To the best of our knowledge, none of the initial committers are being paid to develop code for this project.

== Relationships with Other Apache Products ==

!DirectMemory fits naturally in the ASF because it could be successfully employed together with a large number of ASF products, ranging from JCS (as a new cache region between the heap and indexed-file ones), to ORM systems like Cayenne (i.e. replacing the current OSCache-based implementation), Apache JDO and JPA implementations, and also Java-based databases (e.g. Derby) and systems managing large amounts of data, from Hadoop to Cassandra.

== An Excessive Fascination with the Apache Brand ==

While the Apache Software Foundation would be a good home for the !DirectMemory project it already has
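The off-heap storage that the DirectMemory proposal above describes can be sketched with nothing but the JDK's direct buffers. This is an illustrative toy under the assumption that cached values are serialized to bytes; the `store`/`retrieve` helpers are invented for the example and are not DirectMemory's actual API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy sketch of off-heap caching (not DirectMemory's API): the payload
// lives in a direct ByteBuffer outside the Java heap, so the garbage
// collector never has to trace or copy it.
public class OffHeapSketch {

    // Serialize a value into freshly allocated off-heap memory.
    public static ByteBuffer store(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
        buf.put(bytes);
        buf.flip(); // make the buffer readable from the start
        return buf;
    }

    // Deserialize the value back onto the heap on retrieval; this
    // serialization round trip is the runtime cost the proposal mentions.
    public static String retrieve(ByteBuffer buf) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes); // duplicate() leaves the original position intact
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer offHeap = store("cached-value");
        System.out.println(offHeap.isDirect());  // true
        System.out.println(retrieve(offHeap));   // cached-value
    }
}
```

A real implementation would additionally pool and slice a few large direct buffers instead of allocating one per entry, and layer LFU eviction and expiration on top.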
Re: [VOTE] S4 to join the Incubator
+1

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Patrick Hunt ph...@apache.org
To: general@incubator.apache.org
Sent: Tuesday, September 20, 2011 4:56 PM
Subject: [VOTE] S4 to join the Incubator

It's been nearly a week since the S4 proposal was submitted for discussion. A few questions were asked, and the proposal was clarified in response. Sufficient mentors have volunteered. I thus feel we are now ready for a vote. The latest proposal can be found at the end of this email and at: http://wiki.apache.org/incubator/S4Proposal The discussion regarding the proposal can be found at: http://s.apache.org/RMU Please cast your votes:

[ ] +1 Accept S4 for incubation
[ ] +0 Indifferent to S4 incubation
[ ] -1 Reject S4 for incubation

This vote will close 72 hours from now. Thanks, Patrick

--

= S4 Proposal =

== Abstract ==

S4 (Simple Scalable Streaming System) is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous, unbounded streams of data.

== Proposal ==

S4 is a software platform written in Java. Clients that send and receive events can be written in any programming language. S4 also includes a collection of modules called Processing Elements (or PEs for short) that implement basic functionality and can be used by application developers. In S4, keyed data events are routed with affinity to Processing Elements (PEs), which consume the events and do one or both of the following: (1) ''emit'' one or more events which may be consumed by other PEs, (2) ''publish'' results. The architecture resembles the Actors model, providing semantics of encapsulation and location transparency, thus allowing applications to be massively concurrent while exposing a simple programming interface to application developers.
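The keyed routing described above (every event with the same key reaches the same PE instance, which keeps per-key state) can be sketched roughly as follows. The classes and method names are invented for illustration and do not reflect S4's real API.

```java
import java.util.*;

// Toy sketch of S4-style keyed routing (invented API, not S4's): events
// carry a key, and all events with the same key are delivered to the same
// Processing Element (PE) instance, which keeps per-key state.
public class PeSketch {

    // A trivial PE that counts the events routed to it; a real PE could
    // also emit new events downstream or publish results.
    public static class CounterPE {
        private int count = 0;
        public void processEvent(String event) { count++; }
        public int getCount() { return count; }
    }

    // Keyed affinity: one PE instance per distinct key.
    private static final Map<String, CounterPE> pesByKey = new HashMap<>();

    public static CounterPE route(String key) {
        return pesByKey.computeIfAbsent(key, k -> new CounterPE());
    }

    public static void main(String[] args) {
        // Three events, two distinct keys.
        for (String key : List.of("query:foo", "query:bar", "query:foo")) {
            route(key).processEvent("click");
        }
        System.out.println(route("query:foo").getCount()); // 2
        System.out.println(route("query:bar").getCount()); // 1
    }
}
```

In a cluster the map lookup would be replaced by hashing the key to a partition on some node, which is what gives the platform its location transparency.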
To drive adoption and increase the number of contributors to the project, we may need to prioritize the focus based on feedback from the community. We believe that one of the top priorities and a driving design principle for the S4 project is to provide a simple API that hides most of the complexity associated with distributed systems and concurrency. The project grew out of the need to provide a flexible platform for application developers and scientists that can be used for quick experimentation and production.

S4 differs from existing Apache projects in a number of fundamental ways. Flume is an Incubator project that focuses on log processing, performing lightweight processing in a distributed fashion and accumulating log data in a centralized repository for batch processing. S4 instead performs all stream processing in a distributed fashion and enables applications to form arbitrary graphs to process streams of events. We see Flume as a complementary project. We also expect S4 to complement Hadoop processing and in some cases to supersede it.

Kafka is another Incubator project that focuses on processing large amounts of stream data. The design of Kafka, however, follows the pub-sub paradigm, which focuses on delivering messages containing arbitrary data from source processes (publishers) to consumer processes (subscribers). Compared to S4, Kafka is an intermediate step between data generation and processing, while S4 is itself a platform for processing streams of events. S4 overall addresses a need of existing applications to process streams of events beyond moving data to a centralized repository for batch processing. It complements the features of existing Apache projects, such as Hadoop, Flume, and Kafka, by providing a flexible platform for distributed event processing.

== Background ==

S4 was initially developed at Yahoo! Labs starting in 2008 to process user feedback in the context of search advertising.
The project was licensed under the Apache License version 2.0 in October 2010. The project documentation is currently available at http://s4.io .

== Rationale ==

Stream computing has been growing steadily over the last 20 years. However, recently there has been an explosion in real-time data sources including the Web, sensor networks, financial securities analysis and trading, traffic monitoring, natural language processing of news and social data, and much more. While Hadoop has evolved into a standard open source solution for batch processing of massive data sets, there is no equivalent community-supported open source platform for processing data streams in real time. While various research projects have evolved into proprietary commercial products, S4 has the potential to fill the gap. Many projects that require a scalable stream processing architecture currently use Hadoop by segmenting the input stream into data batches. This solution is not efficient, results in high latency, and introduces unnecessary complexity. The S4 design is
Re: [DISCUSS] DirectMemory to join the Apache Incubator
Oh I hope this gets into ASF. I mentioned DirectMemory on one of the Apache MLs the other day in the context of somebody at Facebook? or Cloudera? working on something very similar for another ASF project maybe HBase. Just mentioning it.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Simone Tripodi simonetrip...@apache.org
To: general@incubator.apache.org
Sent: Tuesday, September 20, 2011 5:48 AM
Subject: [DISCUSS] DirectMemory to join the Apache Incubator

Hi all guys, I would like to propose DirectMemory, a Java OpenSource multi-layered cache implementation featuring off-heap memory storage (a la Terracotta BigMemory) originally developed by Raffaele P. Guidi on GitHub[1], to be an Apache Incubator project. For those interested in knowing more about DirectMemory, you can read Raffaele's related blog[2]. Here's a link to the proposal in the Incubator wiki[3] where we started collecting all needed info. As you will note, the list of mentors is in need of some volunteers, so if you find this interesting, feel free to sign up or let us know you are interested :). Hope to read from you soon, thanks in advance and have a nice day! All the best, Simo

[1] https://github.com/raffaeleguidi/DirectMemory
[2] http://raffaeleguidi.wordpress.com/
[3] http://wiki.apache.org/incubator/DirectMemoryProposal

http://people.apache.org/~simonetripodi/
http://www.99soft.org/
Re: [PROPOSAL] Flume for the Apache Incubator
Looks good to me, Jon. We've contributed to Flume before and plan on making at least a few more contributions in the near future. I look forward to doing that under ASF.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message
From: Jonathan Hsieh j...@cloudera.com
To: general@incubator.apache.org
Sent: Fri, May 27, 2011 10:18:33 AM
Subject: [PROPOSAL] Flume for the Apache Incubator

Howdy! I would like to propose Flume to be an Apache Incubator project. Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data to scalable data storage systems such as Apache Hadoop's HDFS. Here's a link to the proposal in the Incubator wiki: http://wiki.apache.org/incubator/FlumeProposal I've also pasted the initial contents below. Thanks! Jon.

= Flume - A Distributed Log Collection System =

== Abstract ==

Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data to scalable data storage systems such as Apache Hadoop's HDFS.

== Proposal ==

Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to a centralized data store. Its main goal is to deliver data from applications to Hadoop's HDFS. It has a simple and flexible architecture for transporting streaming event data via Flume nodes to the data store. It is robust and fault-tolerant with tunable reliability mechanisms that rely upon many failover and recovery mechanisms. The system is centrally configured and allows for intelligent dynamic management. It uses a simple extensible data model that allows for lightweight online analytic applications. It provides a pluggable mechanism by which new sources, destinations, and analytic functions can be integrated within a Flume pipeline.
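The pluggable source-to-sink pipeline described in the proposal might be sketched as follows. The interfaces here are invented for illustration and are not Flume's actual API; a stand-in list plays the role of the HDFS sink.

```java
import java.util.*;
import java.util.function.*;

// Toy sketch of a pluggable source -> decorator -> sink pipeline in the
// spirit of Flume's data model (invented interfaces, not Flume's API).
public class PipelineSketch {

    // Drain a source through a lightweight online transform into a sink.
    public static List<String> run(Supplier<List<String>> source,
                                   UnaryOperator<String> decorator,
                                   List<String> sink) {
        for (String event : source.get()) {
            sink.add(decorator.apply(event));
        }
        return sink;
    }

    public static void main(String[] args) {
        // Source: pretend these lines were tailed from a web server log.
        Supplier<List<String>> tail = () -> List.of("GET /a", "GET /b");
        // Decorator: stamp each event with its origin host.
        // Sink: a list standing in for an HDFS writer.
        List<String> hdfs = run(tail, e -> "host1 " + e, new ArrayList<>());
        System.out.println(hdfs); // [host1 GET /a, host1 GET /b]
    }
}
```

Because the source, decorator, and sink are independent plug points, swapping the stand-in sink for a real HDFS writer (or adding another decorator) would not change the pipeline's shape, which is the extensibility argument the proposal makes.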
== Background ==

Flume was initially developed by Cloudera to enable reliable and simplified collection of log information from many distributed sources. It was later open-sourced by Cloudera on GitHub as an Apache 2.0 licensed project in June 2010. During this time Flume has been formally released five times as versions 0.9.0 (June 2010), 0.9.1 (Aug 2010), 0.9.1u1 (Oct 2010), 0.9.2 (Nov 2010), and 0.9.3 (Feb 2011). These releases are also distributed by Cloudera as source and binaries, along with enhancements, as part of Cloudera's Distribution including Apache Hadoop (CDH).

== Rationale ==

Collecting log information in a data center in a timely, reliable, and efficient manner is a difficult but important challenge, because when aggregated and analyzed, log information can yield valuable business insights. We believe that users and operators need a manageable, systematic approach to log collection that simplifies the creation, the monitoring, and the administration of reliable log data pipelines. Oftentimes today, this collection is attempted by periodically shipping data in batches and by using potentially unreliable and inefficient ad-hoc methods. Log data is typically generated in various systems running within a data center that can range from a few machines to hundreds of machines. In aggregate, the data acts like a large-volume continuous stream whose contents can have highly varied formats and content. The volume and variety of raw log data make Apache Hadoop's HDFS file system an ideal storage location before the eventual analysis. Unfortunately, HDFS has limitations with regard to durability as well as scaling limitations when handling a large number of low-bandwidth connections or small files. Similar technical challenges arise when attempting to write data to other data storage services. Flume addresses these challenges by providing a reliable, scalable, manageable, and extensible solution.
It uses a streaming design for capturing and aggregating log information from varied sources in a distributed environment, and has centralized management features for minimal configuration and management overhead.

== Initial Goals ==

Flume is currently in its first major release, with a considerable number of enhancement requests, tasks, and issues recorded towards its future development. The initial goal of this project will be to continue to build community in the spirit of the Apache Way, and to address the highly requested features and bug fixes towards the next dot release. Some goals include:

* To stand up a sustaining Apache-based community around the Flume codebase.
* Implementing core functionality of a usable highly-available Flume master.
* Performance, usability, and robustness improvements.
* Improving the ability to
Re: Name change from Lucene Connectors Framework to Apache Connectors Framework
Here's a non-abstract one:

- Apache Data (Source?) Connectors?

Perhaps Data (Source) would make it clear what this is about.

Otis

- Original Message
From: Benson Margulies bimargul...@gmail.com
To: general@incubator.apache.org
Sent: Mon, August 30, 2010 1:23:35 PM
Subject: Re: Name change from Lucene Connectors Framework to Apache Connectors Framework

It seems to me that the pivotal problem here is the word connector. On the one hand, it could mean almost anything to almost anyone. On the other hand, it has a specific denotation in the vicinity of httpd. Everything at Apache is in the vicinity of httpd. I'd offer the following 'made-up' options, all following Apache:

- manifold (many connections)
- omnivore (eats anything)
- rapunzel (spins straw into gold)
- diogenes (seeking for something)
- lantern (ditto)
- helium (fuel for Solr)

The whole question of brand management strikes me as interesting: is it, in fact, the job of the Incubator PMC to groom the Apache branding portfolio by guiding new projects towards better names? Is that in our charter, or should we, as Chris suggests, defer to someone else for problems in this area?

On Mon, Aug 30, 2010 at 1:12 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote:

Guys, If I may: since we're discussing marks, why not post to trademarks@ and ask Shane and crew to weigh in? Maybe you have already, but if so, I haven't seen that discussion mentioned over here on gene...@incubator. Thanks! Cheers, Chris

On 8/30/10 10:03 AM, Grant Ingersoll gsing...@apache.org wrote:

On Aug 27, 2010, at 12:15 PM, David Jencks wrote:

To try to illustrate my thinking rather than push a name down your throat...

Open ConnectorFramework/OpenConnectorFramework/OpenCF: OK, since you've added a branding word. Not ideal since the purpose appears overly broad.

Content Connector Framework/ContentConnectorFramework/CCF: OK, since you've clarified the scope. Not ideal since it has no branding word.
OpenContentConnectorFramework/OpenCCF: better, since it clarifies the scope and includes a branding word.

So, the word open somehow alleviates your concern? I don't get that. If your objection is that it comes across as being _the_ Apache connector library, then how does Open modulate that? It's still the Apache Open Connector Framework. It's still descriptive and still implies it's the one. Besides, it's the ASF, isn't Open implied/redundant? We would never have the Apache Closed Connector Framework, right? Likewise, the word Content implies the same only status, albeit here I will give you that it distinguishes it from Tomcat Connector somewhat, although the Tomcat Connector is just that, the Tomcat connector. However, I still don't buy that it is a branding word. Content is pretty much meaningless. Everything is content. I have no doubt that we could write a plugin for ACF that connected to Tomcat and got Content out of it. Heck, we already do. It's called a web crawler. So, that leaves us, in my mind, with the option of some made-up name, or we stick with ACF. I'm all for a made-up one if someone comes up with one, I just don't know what it is and no one in the community seems to have one either. ACF fits and the community likes it. It's not unprecedented at the ASF and I don't think it is confusing with Tomcat Connector. At any rate, the community would like some resolution. Should I just call an official vote on ACF and if it loses then we will go back to the drawing board?

-Grant

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: [VOTE] Move Lucy to the Incubator
+1

- Original Message
From: Chris Hostetter hossman_incuba...@fucit.org
To: general@incubator.apache.org
Sent: Sat, July 17, 2010 6:23:09 PM
Subject: [VOTE] Move Lucy to the Incubator

I would like to call a vote for accepting Apache Lucy for incubation in the Apache Incubator. The full proposal is available below. We ask the Incubator PMC to sponsor it, with myself (hossman) as Champion, and mattmann, upayavira, mikemccand, and hossman volunteering to be Mentors. Please cast your vote:

[ ] +1, bring Lucy into Incubator
[ ] +0, I don't care either way
[ ] -1, do not bring Lucy into Incubator, because...

This vote will be open for 72 hours and only votes from the Incubator PMC are binding.

http://wiki.apache.org/incubator/LucyProposal

PREFACE

Lucy is a sub-project which is being spun off from the Lucene TLP but is not yet ready for graduation. We propose to address certain needs of the project by transitioning to an Incubator Podling, and assimilating the KinoSearch codebase.

ABSTRACT

Lucy will be a loose port of the Lucene search engine library, written in C and targeted at dynamic language users.

PROPOSAL

Lucy has two aims. First, it will be a high-performance C search engine library. Second, it will maximize its usability and power when accessed via dynamic language bindings. To that end, it will present highly idiomatic, carefully tailored APIs for each of its host binding languages, including support for subclasses written entirely in the host language.

BACKGROUND

Lucy, a loose C port of Java Lucene, began as an ambitious, from-scratch Lucene sub-project, with David Balmain (author of Ferret, a Ruby/C port of Lucene), Doug Cutting, and Marvin Humphrey (founder of KinoSearch, a Perl/C port) as committers. During an initial burst of activity, the overall architecture for Lucy was sketched out by Dave and Marvin. Unfortunately, Dave became unavailable soon after, and without a working codebase to release or any users, it proved difficult to replace him.
Still, Marvin carried on their work throughout a period of seemingly low activity. In the last year, that work has come to fruition: major technical milestones have been achieved and Lucy's underpinnings have been completed. Additionally, other developers from the KinoSearch community have taken an interest in Lucy and have begun to ramp up their contributions. The next steps for Lucy were articulated by the Lucene PMC in a recent review: make releases, acquire users, grow community. To implement the Lucene PMC's recommendations and get to a release as quickly as possible, the Lucy community proposes to assimilate the KinoSearch codebase, which has been retrofitted to use Lucy's core. Lucy still lacks a number of important indexing and search classes; we wish to flesh these out via IP clearance work rather than software development. Because Lucene is working to move away from being an umbrella project, a long term goal of the Lucy project is to graduate to an ASF TLP. With that in mind, it seems more appropriate for the KinoSearch software grant to take place within the context of the Incubator, and that a Lucy podling and PPMC be established which will ultimately take responsibility for the codebase.

RATIONALE

There is great hunger for a search engine library in the mode of Lucene which is accessible from various dynamic languages, and for one accessible from pure C. Individuals naturally wish to code in their language of choice. Organizations which do not have significant Java expertise may not want to support Java strictly for the sake of running a Lucene installation. Developers may want to take advantage of C's interoperability and fine-grained control. Lucy will meet all these demands. Apache is a natural home for our project given the way it has always operated: user-driven innovation, security as a requirement, lively and amiable mailing list discussions, strength through diversity, and so on.
We feel comfortable here, and we believe that we will become exemplary Apache citizens. INITIAL GOALS * Make a 1.0 stable release as quickly as possible. * Concentrate on community expansion. * Expose a public C API. CURRENT STATUS Meritocracy Our initial committer list includes two individuals (Peter Karman and Nathan Kurz) who started off as KinoSearch users, demonstrated merit through constructive forum participation, adept negotiation, consensus building, and submission of high-quality contributions, and were invited to become committers. Peter now rolls most
Re: [PROPOSAL] jSpirit Project
Grégoire, no attachment. ML software doesn't like it. I suggest you put it on SF. Otis - Original Message From: Grégoire Rolland grolland.jspi...@gmail.com To: general@incubator.apache.org Sent: Mon, July 19, 2010 4:35:54 AM Subject: Re: [PROPOSAL] jSpirit Project Hello, I have appended the proposal with several answers about SaaS, multi-tenancy, respect for standards, and what jSpirit really is. Thanks to Otis for the feedback. Don't hesitate to send questions, feedback, and new ideas; we want to build this project with anyone in the community who is interested. Best Regards, Grégoire On 16/07/2010 16:57, Grégoire Rolland wrote: Hello, I'm here to propose a new project for the Apache Incubator, related to a previous post I wrote here. You can find the first draft of the proposal here [ http://wiki.apache.org/incubator/JSpiritProposal ]. We are looking for a Champion, Mentors, and interested developers, and we respectfully ask the Incubator to sponsor this project. We would be happy to receive your feedback about this proposal. Thanks for your support; we are happy to begin new work with you! Best Regards, Here is the text of the proposal: = Abstract = jSpirit will be a platform for efficiently developing enterprise-class SaaS applications with real multi-tenant support and cloud deployment. = Proposal = jSpirit will provide a technical foundation on which application developers can create enterprise software distributed as services. jSpirit will implement a global, out-of-the-box architecture supporting multi-tenancy. By multi-tenancy, I mean an architecture that shares the same application among multiple clients, with support for client-specific behavior. The technical foundation will include an integration framework designed to simplify and abstract the technical complexity of J2EE for the application developer, a set of tools to industrialize the production of applications, a complete application stack, and a set of methods and recommendations for developing efficiently. 
= Background = jSpirit was initially developed for a French company that wanted to create a multi-tenant SaaS ERP for trading in the agribusiness world. The application is now finished, and this company has opened the code of the project's foundation. At the time, there was no foundation framework providing multi-tenancy support, so something like jSpirit needed to be developed. The experience of developing such an application showed that there is a need for tools and methods to do this. = Rationale = I think there is a strong need for architecture and simplicity in the Java world. Multi-tenancy problems are difficult to solve, and the need for such applications will grow in the future. jSpirit will implement an out-of-the-box architecture, a seamless programming model, and technical modules to simplify development. jSpirit's goal is to become a concentration of the experience of open-source and advanced J2EE developers, providing a platform for efficiently developing applications in the SaaS and multi-tenant world. = Initial Goals = The first goal is to develop a user and developer community around the project to ensure the quality and usability of the platform. Our open-source experience is not extensive, so we think it's important to rely on a community to make the project live. The second goal is to document the project to make it more usable as-is. The third goal is to enlarge its functionality and make the project more coherent with the Apache ecosystem. = Current Status = == Code Base == All the code is here: [[http://sourceforge.net/projects/jspirit/|Sourceforge]]. The current code base implements all the functionality below. 
=== Architecture === * Multi-tiered architecture out of the box: implementation of Integration Layer, Business Layer, Client Layer * Java 5 annotation- and auto-injection-based lookup of services * Classpath scanning for auto-discovery of components * Modular and pluggable architecture: automatic activation of modules on the classpath, ready for seamless integration * Implementation of the Long-Conversation pattern, with JTA 2PC support (via the Geronimo Transaction Manager) and implicit demarcation (explicit demarcation is always possible) * [in progress] AOP interceptors on top of each layer === Integration Layer === * Implementation of abstract integration services and an abstract persister based on JPA * Maven plugins for code generation of the integration layer from an XML description of the component business model: generates persistent classes, access services, queries, constraints, JPA annotations, and Lucene indexing of the business model * Bean Validation integration * Full multi-tenancy integration on the EntityManager and caches * Multi-tenant PostgreSQL support === Business Layer === * Implementation of
Re: New project proposal
Grégoire, Could you please point us/me to some information about jSpirit functionality that is SaaS-specific? Understanding that may help people figure out what jSpirit brings and does. For example, if I use jSpirit, which SaaS-specific functionality does a developer not have to develop? What functionality comes out of the box? etc. Thanks, Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Grégoire Rolland grolland.jspi...@gmail.com To: general@incubator.apache.org Sent: Tue, July 13, 2010 4:11:52 AM Subject: New project proposal Hello, I'm the project leader of an open-source project called jSpirit. The goal of the project is to create an open-source platform for efficiently developing enterprise-class, lightweight J2EE applications for SaaS with multi-tenant support. The code is available here (http://sourceforge.net/projects/jspirit/). The platform focuses on the technical aspects of SaaS and multi-tenancy. I would like my project to become an Apache Incubator project, and I need help to do this. I think this kind of platform could interest a large community. The goals are to provide an open-source application stack (focused on Apache projects), tools for efficient development, an architectural model for enterprise-class applications, methods for project management, and an integration framework to rescue application developers from J2EE and multi-tenant complexity. The project is already used by a French company as the foundation of its ERP (Husson Ingenierie, http://husson-info.fr); it is the base of the community so far. I want to develop my professional activity around this project, so I think it will be a long-lived project. Is anyone interested in this project? 
Best Regards, -- Grégoire Rolland Projet *jSpirit* Tel : (+33) (0) 6 82 77 59 94 mailto:grolland.jspi...@gmail.com - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
Re: [Proposal] JPPF : a parallel processing framework for Java
Was my thinking, too. How long before dc.apache.org (or some variant of it) is formed? Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: Grant Ingersoll gsing...@apache.org To: general@incubator.apache.org Sent: Tue, January 12, 2010 3:50:52 PM Subject: Re: [Proposal] JPPF : a parallel processing framework for Java On Jan 12, 2010, at 1:47 PM, Alan D. Cabrera wrote: On Jan 12, 2010, at 7:27 AM, Grant Ingersoll wrote: On Jan 12, 2010, at 10:12 AM, Emmanuel Lécharny wrote: Grant Ingersoll wrote: Seems like this might fit nicely with Hadoop. Has anyone approached their PMC about sponsoring? No, not yet, but that's clearly an option. At least, a better fit than MINA, IMO. Let's do that. Yeah, Hadoop isn't just about Map-Reduce. Just curious, if there's no Hadoop tech in the project then why have it sponsored by Hadoop? I'd add, I think most in Hadoop land view Hadoop as one of the primary places for large-scale distributed computing at Apache. Map-Reduce is one approach and it does not fit all situations, so I think you'll see other things arise there, possibly JPPF.
Re: [VOTE] Incubate Lucene Connector Framework
herein described is accepted. MetaCarta patents are not infringed by this grant. Also, MetaCarta trademarks are not included in this grant. External Dependencies The project dependencies, other than on other Apache projects, are as follows: The ConnectorFramework core currently uses the Bitmechanic JDBC pool driver, which is BSD licensed, and the Postgresql JDBC driver, which is also BSD licensed. The LiveLink Connector relies on LAPI, which is privately licensed by OpenText. The Documentum Connector relies on DFC, which is privately licensed by EMC. The Share Connector relies on jCIFS, which is LGPL. The Memex Connector relies on privately licensed java libraries from Memex. The FileNet Connector relies on privately licensed java libraries from IBM. Required Resources • Mailing lists • connectors-private (with moderated subscriptions) • connectors-user@ • connectors-dev@ • connectors-commit@ • Subversion directory • https://svn.apache.org/repos/asf/incubator/connectors • Website • Confluence (CONNECTORS) • Issue Tracking • JIRA (CONNECTORS) Initial Committers Names of initial committers with affiliation and current ASF status: • Karl Wright (kwright at metacarta) • Josiah Strandberg (jstrandberg at metacarta) • Ken Baker (bakerkj at metacarta) • Marc Meadows (mam at metacarta) • Grant Ingersoll ( gsing...@a.o Lucid Imagination, ASF Member) • Brian Pinkerton (brian.pinkerton at Lucid Imagination) • Simon Willnauer (simonw at apache org, Committer on Lucene Java and Lucene Open Relevance Project) • Ryan McKinley (ryan at apache org, Committer on Lucene and Solr) • Robert Muir (rmuir at apache org, Committer on Lucene and Open Relevance) • Sami Siren ( si...@a.o , Committer on Nutch and Tika) • Otis Gospodnetic ( o...@a.o , Committer on Lucene, Solr, Nutch, Mahout, and Open Relevance Project) • Shalin Shekhar Mangar ( sha...@a.o , AOL, Committer on Apache Solr) • Noble Paul ( no...@a.o , AOL, Committer on Apache Solr) • George Aroush (george at aroush.net, Committer 
on Lucene.Net) Sponsors Champion • Grant Ingersoll Nominated Mentors • Grant Ingersoll • Jukka Zitting • Gianugo Rabellino Sponsoring Entity • Apache Lucene PMC: Message ID: af7e...@gmail.com in priv...@lucene.a.o
Re: [VOTE] Graduate Lucene.Net as a subproject under Apache Lucene
+1 - Original Message From: George Aroush geo...@aroush.net To: general@incubator.apache.org Sent: Wed, October 7, 2009 9:59:43 PM Subject: [VOTE] Graduate Lucene.Net as a subproject under Apache Lucene Hi Folks, On behalf of the Lucene.Net mentor, committers, and community, this is a call to vote on graduating the Lucene.Net project (http://incubator.apache.org/lucene.net/) as a sub-project under Apache Lucene. The Lucene.Net mentor, committers, and the community have voted as follows: +1 from Erik Hatcher (mentor) +1 from George Aroush (committer) +1 from Isik YIGIT (aka: DIGY) (committer) +1 from Doug Sale (committer) +1 from a total of 70+ Lucene.Net members / followers / users. (with no -1 or 0 votes) The vote result can be found here: http://mail-archives.apache.org/mod_mbox/incubator-lucene-net-user/200909.mbox/%3c166a01ca3739$13947380$3abd5a...@net%3e The rationale for graduation is: * Lucene.Net has been under incubation since April 2006 (3 1/2 years now). * During incubation, Lucene.Net has: - Made 1 official release (Incubating-Apache-Lucene.Net-2.0-004-11Mar07). - Released, as SVN tags, 18 ports of Java Lucene (from 1.9 to 2.4.0). - Released, as SVN tags, ports of WordNet.Net 2.0, SpellChecker.Net 2.0, Snowball.Net 2.0, and Highlighter.Net 2.0. - Released MSDN-style documentation for the above release. - Accepted two new committers: Isik YIGIT (DIGY) digydigy @ gmail.com and Doug Sale dsale @ myspace-inc.com were added in November 2008 (George Aroush george @ aroush.net is the original committer). - Grown the community, with a healthy following. - Is being used by well-established companies in production (I'm not sure about the legality of mentioning their names here, or even whether I have the complete list). - Is being used by the Beagle project. 
* Work is already under way to port Java Lucene 2.9 to Lucene.Net 2.9. If this graduation is approved, Lucene.Net will officially be called Apache Lucene.Net. Please cast your votes: [ ] +1 Graduate Lucene.Net as a sub-project under Apache Lucene. [ ] -1 Lucene.Net is not ready to graduate as a sub-project under Apache Lucene, because ... This vote will close on October 17th, 2009. Regards, -- George
Re: [VOTE] Accept Wink proposal for incubation
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Nicholas L Gallardo nlgal...@us.ibm.com To: general@incubator.apache.org Cc: Bryant Luk b...@us.ibm.com; Christopher J Blythe cjbly...@us.ibm.com; Dustin Amrhein damr...@us.ibm.com; Baram, Eliezer eba...@hp.com; el...@hp.com; Greg Truty gtr...@us.ibm.com; Jesse A Ramos jra...@us.ibm.com; Snitkovsky, Martin martin.snitkov...@hp.com; Michael Rheinheimer r...@us.ibm.com; nadav.fisc...@hp.com; tali.alsaigh-co...@hp.com; tomer.sh...@hp.com Sent: Friday, May 15, 2009 11:54:35 AM Subject: [VOTE] Accept Wink proposal for incubation Dear Incubator PMC Members, The Wink team would like to officially present the proposal for the Wink REST runtime for incubation in the Apache Incubator. This proposal has been surfaced previously and is also available at: http://wiki.apache.org/incubator/WinkProposal Please cast your votes: [ ] +1, Accept Wink for incubation [ ] +0, Indifferent to Wink incubation [ ] -1, Reject Wink for incubation (if so, please help us understand why) The formal proposal, included below, provides supporting details on why this proposal is coming forward and who is involved. Thanks and cheers on behalf of the team. - Abstract - Apache Wink is a project that enables development and consumption of REST style web services. The core server runtime is based on the JAX-RS (JSR 311) standard. The project also introduces a client runtime which can leverage certain components of the server-side runtime. Apache Wink will deliver component technology that can be easily integrated into a variety of environments. - Proposal - Apache Wink is a project that enables and simplifies development of REST style HTTP based services. The project includes both server and client side components that can be used independently of each other. The server side is a stand-alone component that integrates easily with many existing application servers. 
The client-side API enables the user to develop applications that interact with server resources in a RESTful manner. The goal is to provide component technology for both RESTful services and clients that can be used in a number of contexts. These contexts could range from a full Java EE runtime environment (Geronimo) to a J2SE environment with a simple HTTP listener service. The server component of Apache Wink will implement a TCK-compliant version of the JAX-RS standard defined by JSR 311 (https://jsr311.dev.java.net/). The client-side component provides a rich API for quickly developing applications that access and update server resources using JAX-RS requests. The API can accommodate data returned in several popular formats including JSON, XML, ATOM, HTML, and CSV. Plans for future extensions are currently being discussed, but include a focus on ease of use through service discovery and quality-of-service configuration (security, caching). - Background - Over the past decade, the Representational State Transfer (REST) architectural style of web services has been gaining popularity. Introduced by Roy Fielding in 2000, the idea of providing simple HTTP-based access to server resources has continued to grow even as other, more complex web service architectures have been published. The JSR 311 standard (https://jsr311.dev.java.net) defines a standard set of annotations and a programming model for exposing Java resources as REST-based resources. With the recent approval of the standard and its inclusion in Java EE 6, the use of REST and its Java programming standard (JAX-RS) will certainly grow in the near future. As such, there will be demand for an Apache-friendly, open-source implementation of the standard. Apache Wink seeks to provide this implementation in an independent manner that is not tied to any platform. 
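The annotation-driven model JSR 311 describes (annotated methods mapped to HTTP verbs and URL paths) can be illustrated with a toy dispatcher. The sketch below is in Python, with plain decorators standing in for JAX-RS annotations such as @Path and @GET; every name here is an illustrative assumption, not part of Wink's actual API.

```python
# Toy illustration (hypothetical names, not the Wink API): JSR 311 maps
# annotated resource methods to HTTP verbs and paths; here a decorator
# plays the role of the @Path/@GET annotations.
_routes = {}

def get(path):
    """Register a handler for GET requests on the given path."""
    def register(func):
        _routes[("GET", path)] = func
        return func
    return register

@get("/hello")
def hello():
    # A resource method: returns a (status, body) pair.
    return 200, "Hello, REST"

def dispatch(method, path):
    """Route a request to the matching handler, or return 404."""
    handler = _routes.get((method, path))
    if handler is None:
        return 404, ""
    return handler()
```

In a real JAX-RS runtime the container performs this lookup from the annotations at deployment time; the sketch only shows the verb-plus-path dispatch idea.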
- Rationale - The rationale for the project is to build an implementation of the JAX-RS specification in open source that can be certified by the applicable TCKs (JSR-311). The project would also provide integration with Geronimo and other open-source REST communities. Building a strong, vendor-neutral community is important to the project so that it will outlast any one person's or company's participation. Code released from the project will also provide a basis to prototype and build new extensions that could eventually be taken for standardization as an extension to the JSR 311 work (such as a client API). However, the server side is only half of the equation. Once the server provides access to a resource, there need to be clients to access and utilize the data. As such, we want to provide a well rounded package that also
Re: [PROPOSAL] Apache SocialSite
Another +1 from me, too. SocialSite needs to live on and Apache could be a good home for it. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jamey Wood jamey.w...@gmail.com To: general@incubator.apache.org Cc: Eduardo Pelegri-Llopart pele...@sun.com; rovagn...@gmail.com; Robert Bissett robert.biss...@sun.com; leandro.milma...@globant.com; rodr...@globant.com; Tony Ng tony...@sun.com Sent: Thursday, April 23, 2009 11:47:00 AM Subject: Re: [PROPOSAL] Apache SocialSite I'm very much +1 on this. I appreciate Sun's willingness to contribute the existing SocialSite code, and I hope that we'll have the opportunity to evolve it under the Incubator's established governance model and level playing field. --Jamey On Wed, Apr 8, 2009 at 7:41 PM, Dave wrote: Greetings to all, It's my pleasure to present to you a proposal for a new project Apache SocialSite, a social networking service based on Apache Shindig (incubating) with an end-user interface composed entirely of OpenSocial gadgets and designed to add social networking features to existing web applications (e.g. Roller, JSPWiki, your favorite webapp, etc.). You can find the full proposal on the Incubator wiki: http://wiki.apache.org/incubator/SocialSiteProposal I look forward to your comments and suggestions on this proposal. Thanks, Dave
Re: UIMA [WAS Re: Suspending Projects]
My GSoC suggestion was related to the earlier comment that for some reason UIMA is unable to engage more contributors and convert them into committers (GSoC could help get some fresh blood), not the branch tied to academia. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ross Gardler rgard...@apache.org To: general@incubator.apache.org Sent: Saturday, February 21, 2009 10:40:12 PM Subject: Re: UIMA [WAS Re: Suspending Projects] 2009/2/21 Otis Gospodnetic : Perhaps GSoC is something to consider. I see UIMA didn't have anything in 2008: http://wiki.apache.org/general/SummerOfCode2008 GSoC is entirely separate from the academics; it is for students. The problems expressed in this thread are with respect to research staff who produce software as part of their work. I agree the GSoC mentoring idea can be used to educate the staff within a university. I'm currently seeking funding for doing something with this, but my existing project focuses on the research staff being discussed here. Ross Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thilo Goetz To: general@incubator.apache.org Sent: Friday, February 20, 2009 4:31:50 PM Subject: Re: UIMA [WAS Re: Suspending Projects] Niclas Hedhman wrote: On Fri, Feb 20, 2009 at 6:58 AM, Robert Burrell Donkin wrote: We should probably try to find the collective energy to review UIMA before the project's enthusiasm is sapped. That sounds like a healthy observation. My Q for the community is: Do you have a healthy and diverse set of users? If so, have the UIMA team looked at what is stopping these users from becoming contributors? Yes, we do have a healthy and diverse user community. We have racked our brains over what we could do to attract more community contribution. We've created a sandbox to facilitate the inclusion of experimental technology. There's been some uptake, but not enough. 
Some of us are working on scale-out via JMS and are hoping to attract contributions in that area. We've started discussions and suggested things for people to work on. I don't know, maybe we're going about this the wrong way. My pet hypothesis (or maybe I'm just looking for excuses) is this: UIMA is heavily used in academia. Now academics have no problems with open source, to the contrary. But they have an overwhelming need to publish and build up a reputation. So they like to publish their source code on their own web site, where it's clear it's their work, rather than contribute to some community effort. If you look around, you'll see all manner of university efforts around UIMA, but very little of that code finds its way back into the ASF repo. Enough whining. If you have any suggestions, we'll be happy to hear them. --Thilo I could imagine a whole range of reasons, and if that is 'fixed' the diversity comes with it... If there is not a user community, then I would be concerned to graduate the project with the large set of single-employer committers. Cheers Niclas -- -- Ross Gardler OSS Watch - awareness and understanding of open source software development and use in education http://www.oss-watch.ac.uk
Re: UIMA [WAS Re: Suspending Projects]
Perhaps GSoC is something to consider. I see UIMA didn't have anything in 2008: http://wiki.apache.org/general/SummerOfCode2008 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thilo Goetz twgo...@gmx.de To: general@incubator.apache.org Sent: Friday, February 20, 2009 4:31:50 PM Subject: Re: UIMA [WAS Re: Suspending Projects] Niclas Hedhman wrote: On Fri, Feb 20, 2009 at 6:58 AM, Robert Burrell Donkin wrote: We should probably try to find the collective energy to review UIMA before the project's enthusiasm is sapped. That sounds like a healthy observation. My Q for the community is: Do you have a healthy and diverse set of users? If so, have the UIMA team looked at what is stopping these users from becoming contributors? Yes, we do have a healthy and diverse user community. We have racked our brains over what we could do to attract more community contribution. We've created a sandbox to facilitate the inclusion of experimental technology. There's been some uptake, but not enough. Some of us are working on scale-out via JMS and are hoping to attract contributions in that area. We've started discussions and suggested things for people to work on. I don't know, maybe we're going about this the wrong way. My pet hypothesis (or maybe I'm just looking for excuses) is this: UIMA is heavily used in academia. Now academics have no problems with open source, to the contrary. But they have an overwhelming need to publish and build up a reputation. So they like to publish their source code on their own web site, where it's clear it's their work, rather than contribute to some community effort. If you look around, you'll see all manner of university efforts around UIMA, but very little of that code finds its way back into the ASF repo. Enough whining. If you have any suggestions, we'll be happy to hear them. --Thilo I could imagine a whole range of reasons, and if that is 'fixed' the diversity comes with it... 
If there is not a user community, then I would be concerned to graduate the project with the large set of single-employer committers. Cheers Niclas
Re: UIMA [WAS Re: Suspending Projects]
+1. We have this attitude over at Lucene and I think it works well. We also have HowToContribute pages for both Lucene and Solr and regularly point people to them. We also encourage contribution via "Perhaps you can open a JIRA issue and contribute your patch" type suggestions on the ML. We give credit to all contributors and always thank them. Lucene has been around for about 10 years now. It's a very healthy and active project. It now has lots and lots of contributors, but it took some time to get there. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: William A. Rowe, Jr. wr...@rowe-clan.net To: general@incubator.apache.org Sent: Saturday, February 21, 2009 12:57:09 AM Subject: Re: UIMA [WAS Re: Suspending Projects] Robert Burrell Donkin wrote: I know that we usually try to strongly encourage companies not to use Apache as a dumping ground, but I wonder sometimes whether it might be useful to accept more contributions of proof-of-concept code, especially from academia. It's often easier to start from some proof-of-concept code than from scratch. +1 - let's please not forget that NCSA httpd was exactly that, a code dump (well, fork/import) of an abandoned work :)
Re: [VOTE] Accept Cassandra into the Incubator
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ian Holsman li...@holsman.net To: general@incubator.apache.org Sent: Tuesday, December 23, 2008 5:01:37 PM Subject: [VOTE] Accept Cassandra into the Incubator Dear Incubator PMC, There has been some discussion around the Cassandra proposal, and we would now like to officially propose Cassandra to the Incubator for consideration. Please vote on accepting the Cassandra project for incubation. The full Cassandra proposal is available at the end of this message and as a wiki page at http://wiki.apache.org/incubator/Cassandra. We ask the Incubator PMC to sponsor the Cassandra podling, with Brian as the Champion, and Torsten, Matthieu, and Ian volunteering to mentor as well. The vote is open for the next 72 hours and only votes from the Incubator PMC are binding. [ ] +1 Accept Cassandra as a new podling [ ] -1 Do not accept the new podling (provide reason, please) = Abstract = Cassandra is a distributed storage system for managing structured/unstructured data while providing reliability at a massive scale. = Background = Development of Cassandra started at Facebook in June 2007. It started as a system to solve the Inbox Search problem and has since matured to solve various storage problems associated with structured/unstructured data. = Rationale = Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure. The philosophy behind the design of the storage portion of Cassandra is that it be able to satisfy the requirements of applications that demand storage of large amounts of structured data. Reliability at massive scale is a very big challenge. Outages in the service can have significant negative impact. Hence Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different datacenters). 
At this scale, small and large components fail continuously; the way Cassandra manages persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. = Initial Source = Initial source can be obtained from the following site - http://the-cassandra-project.googlecode.com/svn/branches/development/. The mailing list is currently maintained at the same site. We will move it over to Apache once this proposal has been accepted. = Source and Intellectual Property Submission Plan = = External Dependencies = * All dependencies have Apache-compatible licenses. Dependencies are log4j, Thrift, and Apache Commons. = Cryptography = * None = Committers = * Avinash Lakshman * Prashant Malik * Kannan Muthukkaruppan * Jiansheng Huang * Dan Dumitriu = Current Status = == Meritocracy == * Though initial development was done at Facebook, Cassandra was intended to be released as an open-source project from its inception. The environment will lend itself to supporting meritocracy at all times. == Community == * Folks who are actively considering deploying/prototyping Cassandra in their respective organizations. == Core Developers == * Avinash Lakshman * Prashant Malik * Kannan Muthukkaruppan == License == * The Cassandra codebase is Apache 2.0 licensed and currently hosted at Google Code. = Known Risks/Avoiding the Warning Signs = == Orphaned Products == * Cassandra is already deployed within Facebook, and many other organizations are actively moving to deploy it in production. The original developers are and will remain actively involved, so there is no realistic chance of it becoming orphaned. == Homogeneous Developers == * The current list of committers includes developers from different companies. The committers are geographically distributed across the U.S. == Reliance on Salaried Developers == * Yes, but we don't expect this to be a risk of any nature. 
== Relationships with Other Apache Products == * The Cassandra project is 'similar' to HBase/HDFS in concept, but Cassandra is geared more toward online web-site usage than batch processing. It also doesn't have a single point of failure, which makes it interesting as well. * Cassandra makes use of the Thrift project. == An excessive fascination with the Apache brand == * Cassandra has already attracted a stable base of users. There are at least 3 companies who are planning to use Cassandra in production as far as we know. The reasons for joining Apache are not to advertise the project, but rather to demonstrate the commitment to open source by divorcing the trunk from any one corporation and pursuing further integration with other Apache projects. = Required Resources = == Mailing lists == Once the project is approved, the following mailing lists will be used
Re: Cassandra Incubator Proposal
Hello, The "distributed storage system for managing structured/unstructured data while providing reliability at a massive scale" part sounds kind of like HDFS. Would it be possible to describe how Cassandra is different from HDFS? Perhaps the best place to do it is under the Relationships with Other Apache Products section. Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Avinash Lakshman [EMAIL PROTECTED] To: general@incubator.apache.org Cc: Prashant Malik [EMAIL PROTECTED]; Kannan Muthukkaruppan [EMAIL PROTECTED] Sent: Monday, December 1, 2008 7:51:28 PM Subject: Cassandra Incubator Proposal Hi Folks Please consider our proposal to move the Cassandra project into the Incubation process - http://wiki.apache.org/incubator/Cassandra. Please advise as to what else is required for us to complete this process. Cheers Avinash - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [Vote] accept Droids into incubation
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thorsten Scherler [EMAIL PROTECTED] To: Incubator general@incubator.apache.org Sent: Thursday, October 2, 2008 4:00:41 PM Subject: [Vote] accept Droids into incubation Please vote on accepting Droids into incubation. The proposal can be found at: http://wiki.apache.org/incubator/DroidsProposal The text of the proposal: = Droids, an intelligent standalone robot framework = === Abstract === Droids aims to be an intelligent standalone robot framework that allows users to create and extend existing droids (robots). === Proposal === As a standalone robot framework, Droids will offer infrastructure code to create and extend existing robots. In the future it will also offer a web-based administration application to manage and control the different droids, which will communicate with this app. Droids makes it very easy to extend existing robots or write a new one from scratch that can automatically seek out relevant online information based on the user's specifications. Thanks to its flexible design, it can directly reuse any custom business logic written in Java. In the long run it should become an umbrella for specialized droids hosted as sub-projects, where an ultimate goal is to integrate artificial intelligence that can control a swarm of droids and actively plan/react to different tasks. === Background === The initial idea for the Droids project was voiced in February 2007 by Thorsten Scherler, mainly out of personal curiosity, and developed as a labs project. The background of his work was that Cocoon trunk (2.2) no longer provided a crawler, and Forrest was based on it, meaning we could not update until we found a crawler replacement. Getting more involved in Solr and Nutch, he saw the demand for a generic standalone crawler. For the first version he took Nutch and ripped out and modified its plugin/extension framework.
However, the second version was no longer based on it but used Spring instead. The main reason was that Spring had become a standard and helped make Droids as extensible as possible. Soon the first plugins and sample droids were added to the codebase. === Rationale === There is ever more demand for tools that automatically perform certain tasks. Search engines such as Nutch are normally very focused on a specific functionality and not on extensibility. Furthermore, they are mainly focused on crawling: requesting certain pages and extracting links to other pages, which in our opinion is only one small area for automated robots. While there are a number of existing crawler libraries for various tasks, each of them comes with a custom API and there is no generic interface for automatically determining which crawler (droid) to use for a specific task. The Droids project attempts to remove this duplication of effort. We believe that by pooling the efforts of multiple projects we will be able to create a generic robot framework that exceeds the capabilities and quality of the custom solutions of any single project. The focus of Droids is not a single crawler but rather offering different reusable components that custom droids (robots) can use to automate certain tasks. An intelligent standalone robot framework project will provide common ground not only for the developers of crawlers but also for any other automated application (robot) libraries. === Initial Goals === The initial goals of the proposed project are: * Viable community around the Droids codebase * Active relationships and possible cooperation with related projects and communities (e.g.
reusing Tika for text extraction) * Generic robot API for crawling, extracting structured text content and/or new tasks, filtering tasks and handling the content * Flexible extension and plugin development to create a wide range of functionality * Fuel development of various droids and bring the current wget-style crawler to a state-of-the-art level == Current Status == === Meritocracy === All the initial committers are familiar with the meritocracy principles of Apache, and have already worked on the various source codebases. We will follow the normal meritocracy rules with other potential contributors as well. === Community === There is not yet a clear Droids community. Instead we have a number of people and related projects with an understanding that an intelligent standalone robot framework project would best serve everyone's interests. The primary goal of the incubating project is to build a self-sustaining community around this shared vision. === Core Developers === The initial set of developers comes from various backgrounds, with different but compatible needs for the proposed project. === Alignment === As a generic robot framework Droids will likely be
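The "reusable components" idea in the proposal can be sketched in a few lines. This is not the Droids API (it is Java, and the class and parameter names below are invented for illustration), but it shows the shape: a droid walks a task queue, a protocol component fetches content, a handler extracts new tasks, and a filter decides which tasks are followed.

```python
# Hedged sketch of a component-based "droid" (all names hypothetical,
# not the real Droids API): fetch / handle / filter are the pluggable
# components; the droid itself is just the generic task-queue loop.
from collections import deque

class SimpleDroid:
    def __init__(self, fetch, handle, accept):
        self.fetch = fetch      # task -> content (protocol component)
        self.handle = handle    # (task, content) -> new tasks (handler)
        self.accept = accept    # task -> bool (filter component)
        self.visited = set()

    def run(self, seed):
        queue = deque([seed])
        while queue:
            task = queue.popleft()
            if task in self.visited or not self.accept(task):
                continue
            self.visited.add(task)
            for new_task in self.handle(task, self.fetch(task)):
                queue.append(new_task)
        return self.visited

# Toy "site": pages linking to other pages.
site = {"a": ["b", "c"], "b": ["c", "x"], "c": [], "x": ["y"], "y": []}
droid = SimpleDroid(
    fetch=lambda t: site.get(t, []),
    handle=lambda t, links: links,
    accept=lambda t: not t.startswith("x"),  # filter: skip 'x*' tasks
)
print(sorted(droid.run("a")))  # → ['a', 'b', 'c']
```

Swapping any one lambda for a different implementation (a different protocol, extractor, or filter) changes the droid's behavior without touching the loop, which is the extensibility argument the proposal makes.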
Re: [PROPOSAL] Droids
This sounds good to me. Are you planning to run Droids on top of Hadoop? If not, why not? Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thorsten Scherler [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Monday, September 22, 2008 4:24:55 PM Subject: [PROPOSAL] Droids This is a proposal to enter the incubator. See http://wiki.apache.org/incubator/DroidsProposal for the most up-to-date version. As Champion we have Grant Ingersoll from the ASF. Droids is an Apache Labs project and we are still looking for some mentors for this proposal. We look forward to comments and discussion. = Droids, an intelligent standalone robot framework = === Abstract === Droids aims to be an intelligent standalone robot framework that allows users to create and extend existing droids (robots). === Proposal === As a standalone robot framework, Droids will offer infrastructure code to create and extend existing robots. In the future it will also offer a web-based administration application to manage and control the different droids, which will communicate with this app. Droids makes it very easy to extend existing robots or write a new one from scratch that can automatically seek out relevant online information based on the user's specifications. Thanks to its flexible design, it can directly reuse any custom business logic written in Java. In the long run it should become an umbrella for specialized droids hosted as sub-projects, where an ultimate goal is to integrate artificial intelligence that can control a swarm of droids and actively plan/react to different tasks. === Background === The initial idea for the Droids project was voiced in February 2007 by Thorsten Scherler, mainly out of personal curiosity, and developed as a labs project.
The background of his work was that Cocoon trunk (2.2) no longer provided a crawler, and Forrest was based on it, meaning we could not update until we found a crawler replacement. Getting more involved in Solr and Nutch, he saw the demand for a generic standalone crawler. For the first version he took Nutch and ripped out and modified its plugin/extension framework. However, the second version was no longer based on it but used Spring instead. The main reason was that Spring had become a standard and helped make Droids as extensible as possible. Soon the first plugins and sample droids were added to the codebase. === Rationale === There is ever more demand for tools that automatically perform certain tasks. Search engines such as Nutch are normally very focused on a specific functionality and not on extensibility. Furthermore, they are mainly focused on crawling: requesting certain pages and extracting links to other pages, which in our opinion is only one small area for automated robots. While there are a number of existing crawler libraries for various tasks, each of them comes with a custom API and there is no generic interface for automatically determining which crawler (droid) to use for a specific task. The Droids project attempts to remove this duplication of effort. We believe that by pooling the efforts of multiple projects we will be able to create a generic robot framework that exceeds the capabilities and quality of the custom solutions of any single project. The focus of Droids is not a single crawler but rather offering different reusable components that custom droids (robots) can use to automate certain tasks. An intelligent standalone robot framework project will provide common ground not only for the developers of crawlers but also for any other automated application (robot) libraries.
=== Initial Goals === The initial goals of the proposed project are: * Viable community around the Droids codebase * Active relationships and possible cooperation with related projects and communities (e.g. reusing Tika for text extraction) * Generic robot API for crawling, extracting structured text content and/or new tasks, filtering tasks and handling the content * Flexible extension and plugin development to create a wide range of functionality * Fuel development of various droids and bring the current wget-style crawler to a state-of-the-art level == Current Status == === Meritocracy === All the initial committers are familiar with the meritocracy principles of Apache, and have already worked on the various source codebases. We will follow the normal meritocracy rules with other potential contributors as well. === Community === There is not yet a clear Droids community. Instead we have a number of people and related projects with an understanding that an intelligent standalone robot framework project would best serve everyone's interests. The primary goal of the incubating project is to build a self-sustaining community around this shared
Re: [PROPOSAL] Etch
Grant is right. Others were making the point about Debian. I was making a point about it being an overly generic English word. The former may be a short-term problem, the latter a very long-term one. If changing the name is such a problem (work) then... Otis P.S. I, too, like the name Etch, it's just that it's an English word, plus there are several other products with Etch in their name. - Original Message From: Grant Ingersoll [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Friday, August 8, 2008 6:28:23 AM Subject: Re: [PROPOSAL] Etch On Aug 8, 2008, at 4:28 AM, James Dixson (jadixson) wrote: Simply put: a name change is work. Before I can accept the need to do work, I want to clearly understand the benefits of doing it. Etch, while new to open-source, does have some awareness in a technical community ( http://developer.cisco.com/web/cuae ). We have been publicly pitching and distributing Etch in our community for several months now. People have been using the technology, and for our current community Etch != Debian. Granted, a couple of months is a short amount of time, but it is something. Imposing a name change on our current community, with the reasoning that the future community would be unable to differentiate between Apache Etch and the etch release of Debian, would be disruptive. I don't think the argument is necessarily that the future community can't distinguish between Apache Etch and Debian; I think the argument is that the future community won't be able to find it, period, which means the future community may well be smaller than it would be with a more distinctive name. Put it this way: you search for Hadoop, and the top 10 on Google is all Apache Hadoop. You search for Etch and you will be lucky to crack the top 10, methinks, but who knows, maybe you'll get enough rank to displace the Etch-a-Sketch and it will be a non-issue. Of course, the work thing I understand, too, although it seems like a global search and replace wouldn't be that bad.
You also certainly could change it over time, even after being accepted into incubation, I think, just as long as it's done before first release. FWIW, I like the name Etch :-) -Grant
Re: [PROPOSAL] Etch
http://www.google.com/search?q=etch -- 12MM hits for an unreleased product (this Etch) http://www.google.com/search?q=hadoop -- 600K hits for a product that's been out for over a year Why such resistance to name change? There is this proverb that might be suitable here. The loose translation is: When one person tells you you are a donkey, ignore him. When two people tell you you are a donkey, go buy yourself a saddle. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Scott Comer (sccomer) [EMAIL PROTECTED] To: general@incubator.apache.org; general@incubator.apache.org Sent: Thursday, August 7, 2008 2:50:37 PM Subject: Re: [PROPOSAL] Etch Doug is a wise man and that is how we picked the name etch 18 month ago. Scott out -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent:Wednesday, August 06, 2008 12:12 PM Pacific Standard Time To:general@incubator.apache.org Subject:Re: [PROPOSAL] Etch Doug Cutting has a few nice and short naming rules that I liked when I read them. I believe one of them was that a Google search for the proposed name should yield very few matches. Hadoop and Lucene are/were good examples of that. Here is another naming example. I created this bookmarking service called simpy - simpy.com . Great name - short, memorable, easy to spell, etc. people said. The problem with it is that simpy is a common misspelling of simply. So, another thing to keep in mind. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: James Dixson (jadixson) To: general@incubator.apache.org Sent: Tuesday, August 5, 2008 6:03:28 PM Subject: RE: [PROPOSAL] Etch I have heard the name concern a couple of times now... When we picked the name Etch about 18 months ago, we knew about the Debian release, but frankly we were unconcerned. Debian etch is the name of a release of Debian, no different than 4 being a release of Fedora. 
Eventually Debian etch will fade into memory just as sarge and woody have. Etch, in our line of thinking (if it could be said we were doing any thinking :-) ) was the name we were giving to the technology, not a release. There were no other technologies named Etch or similar, so we declared victory and moved on. So I guess my question back to everyone is this: What is the concern about the name Etch, really? 1. Is there a legal trademark issue or a formal Apache branding policy issue with using the name Etch, such that use of the name is simply not going to be allowed? - or - 2. Is there a concern that during incubation, we might have to be explicit in communication and always say Apache Etch rather than just Etch because of a fear that a reference to Etch, taken out of context, could be confused with Debian 4.0? If 1 is true then, absolutely, the name should be changed. But if only 2 is true, then I will need a bit more convincing. I am after all very, very lazy and changing the name of a working toolset is well... work :-) --- James Dixson Manager, Software Development CUAE Engineering, Cisco Systems Inc. Direct: 512-336-3305 Mobile: 512-968-2116 [EMAIL PROTECTED] -Original Message- From: Niklas Gustavsson [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 05, 2008 3:33 PM To: general@incubator.apache.org Subject: Re: [PROPOSAL] Etch On Thu, Jul 31, 2008 at 6:16 PM, James Dixson wrote: This is a proposal to enter Etch into the incubator. See http://wiki.apache.org/incubator/EtchProposal for updates. +1 for incubation (non-binding). While I find this area to be a bit overcrowded lately, having both Etch and Thrift at Apache and Protocol Buffers under ASL 2.0 does offer some interesting opportunities for competition as well as cooperation. I do share the concerns about naming conflicts. Debian is by far more well known, and trying to establish this project under a conflicting name would be hard.
/niklas
Re: [PROPOSAL] Etch
Doug Cutting has a few nice and short naming rules that I liked when I read them. I believe one of them was that a Google search for the proposed name should yield very few matches. Hadoop and Lucene are/were good examples of that. Here is another naming example. I created this bookmarking service called simpy - simpy.com . Great name - short, memorable, easy to spell, etc. people said. The problem with it is that simpy is a common misspelling of simply. So, another thing to keep in mind. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: James Dixson (jadixson) [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, August 5, 2008 6:03:28 PM Subject: RE: [PROPOSAL] Etch I have heard the name concern a couple of times now... When we picked the name Etch about 18 months ago, we knew about the Debian release, but frankly we were unconcerned. Debian etch is the name of a release of Debian, no different than 4 being a release of Fedora. Eventually Debian etch will fade into memory just as sarge and woody have. Etch, in our line of thinking (if it could be said we were doing any thinking :-) ) was the name we were giving to the technology, not a release. There were no other technologies named Etch or similar, so we declared victory and moved on. So I guess my question back to everyone is this: What is the concern about the name Etch, really? 1. Is there a legal trademark issue or a formal Apache branding policy issue with using the name Etch such that the use of the name is simply not going to be allowed. - or - 2. Is there a concern that during incubation, we might have to be explicit in communication and always say Apache Etch rather than just Etch because of a fear that a reference to Etch, taken out of context, could be confused with Debian 4.0? If 1 is true then, absolutely the name should be changed. But if only 2 is true, then I will need a bit more convincing. 
I am after all very, very lazy and changing the name of a working toolset is well... work :-) --- James Dixson Manager, Software Development CUAE Engineering, Cisco Systems Inc. Direct: 512-336-3305 Mobile: 512-968-2116 [EMAIL PROTECTED] -Original Message- From: Niklas Gustavsson [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 05, 2008 3:33 PM To: general@incubator.apache.org Subject: Re: [PROPOSAL] Etch On Thu, Jul 31, 2008 at 6:16 PM, James Dixson wrote: This is a proposal to enter Etch into the incubator. See http://wiki.apache.org/incubator/EtchProposal for updates. +1 for incubation (non-binding). While I find this area to be a bit overcrowded lately, having both Etch and Thrift at Apache and Protocol Buffers under ASL 2.0 does offer some interesting opportunities for competition as well as cooperation. I do share the concerns about naming conflicts. Debian is by far more well known, and trying to establish this project under a conflicting name would be hard. /niklas
Re: [DISCUSSION] Hama Proposal
Edward, I was going to email you about this weeks ago, when I first saw this proposal. You are working in a vacuum too much, I think that's the main problem. You are mentioning private e-mails, and that doesn't sound right. Bring this up in the open on [EMAIL PROTECTED] and state what you'd like. I, like Yonik and Grant, feel that this fits very well under Hadoop, either as a sub-project or simply a contrib. I *believe* that if you go the sub-project route, and especially if you simply make Hama a Hadoop contrib, no incubation is necessary, as long as the Hadoop PMC welcomes the code. Much simpler. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: edward yoon [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, March 18, 2008 7:02:23 PM Subject: Re: [DISCUSSION] Hama Proposal Do you have a mail thread reference? This seems small enough in scope and so tied to Hadoop that it seems like it should either just be part of one of the Hadoop sub-projects or, at a maximum, a Hadoop sub-project of its own. http://www.mail-archive.com/[EMAIL PROTECTED]/msg00136.html But I mostly talked about it via private e-mail. They gave me a welcome; however, they also all agreed on the need to make incubation progress. Thanks, Edward. On 3/19/08, Yonik Seeley [EMAIL PROTECTED] wrote: This seems small enough in scope and so tied to Hadoop that it seems like it should either just be part of one of the Hadoop sub-projects or, at a maximum, a Hadoop sub-project of its own. I see you opened https://issues.apache.org/jira/browse/HADOOP-2878 what's the status of that? -Yonik On Tue, Mar 18, 2008 at 8:02 AM, edward yoon [EMAIL PROTECTED] wrote: Dear Incubator PMC, I've updated the Hama project proposal. Please review/update as needed and report back any concerns.
http://wiki.apache.org/incubator/HamaProposal Hama has a strong relationship with the Hadoop, HBase and Mahout projects, so I discussed becoming a sub-project of those projects at length with each community. However, the sub-project route was beset with difficulties, hence the list of committers, etc. And now we all agree that Hama should aim to be general purpose rather than become a specialized piece of something else. http://www.nabble.com/-jira--Created%3A-%28MAHOUT-16%29-Hama-contrib-package-for-the-mahout-to15998717.html https://issues.apache.org/jira/browse/HADOOP-2878 If you think this will make a good ASF project, please encourage our team members to create the world's largest matrix computational framework. Thanks. B. Regards, Edward yoon @ NHN, corp.
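To make the "matrix computational framework on Hadoop" idea concrete, here is a hedged sketch (not Hama's API; function and variable names are invented) of how one matrix operation maps onto the map/reduce style the proposal builds on: each sparse-matrix entry contributes a partial product keyed by row, and the reduce step sums per row.

```python
# Illustrative only (NOT Hama's API): matrix-vector multiplication
# expressed in map/reduce style. Map emits (row, a_ij * x_j) pairs;
# reduce sums the partial products per row key.
from collections import defaultdict

def mapreduce_matvec(entries, x):
    """entries: sparse matrix as (i, j, a_ij) triples; x: dense vector."""
    # Map step: each entry yields a partial product keyed by its row.
    mapped = ((i, a_ij * x[j]) for i, j, a_ij in entries)
    # Reduce step: sum partial products that share a row key.
    result = defaultdict(float)
    for i, partial in mapped:
        result[i] += partial
    return dict(result)

# [[1, 2], [3, 4]] times [5, 6] = [17, 39]
A = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
print(mapreduce_matvec(A, [5, 6]))  # → {0: 17.0, 1: 39.0}
```

Because the map and reduce steps are independent per entry and per row, the same decomposition parallelizes across a cluster, which is why a matrix framework sits naturally on top of Hadoop.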
Re: hit counters for incubating web sites
Ah, good question (don't have the answer). Personally, I'd love to see Apache.org stats via something like Google Analytics, but perhaps this can be a per-project thing. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Marshall Schor [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, January 15, 2008 11:27:01 AM Subject: hit counters for incubating web sites Various posts in the past have expressed interest in collecting statistics on usage, or downloads. Previous replies have pointed out that counting downloads is inaccurate, because Apache licensed components can be redistributed by others, and the Apache mirroring system means that most downloads occur from non-Apache machines. We would like to get some statistical information about downloads, and are thinking that counting clicks on the download button(s) would be a good way (it would avoid the problem of missing mirroring). Although not perfect (it would miss repackaging/redistribution, and other sites which link to a download page other than our own), we think it would be somewhat useful, at least as a lower bound of interest. Vadim Gritsenko has a stats site on people.a.o, looking at downloads by extracting data from web server logs. He has said, however, that he won't track individual incubator projects, just TLP. See http://people.apache.org/~vgritsenko/stats/index.html and http://people.apache.org/~vgritsenko/faq.html . Most of the hit counters out there seem to be snippets of html you add to your web page, which go off to someone else's server, where the counting happens. Is there a service running on an apache server (e.g., people.a.o), which we can use for hit-counting? If so, can someone post the html needed to use it? -Marshall
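The click-counting idea Marshall describes can be sketched as a tiny redirect endpoint: the download link points at the counter, which records the hit and then forwards the browser to the real download page. Everything below is hypothetical (the WSGI app, the artifact path, the target URL) — a minimal illustration of the approach, not an Apache infrastructure service:

```python
# Hedged sketch of the click-counting idea: route the download link
# through a counting redirect. All names/paths here are hypothetical.
from collections import Counter

MIRROR = "http://www.apache.org/dyn/closer.cgi"  # example redirect target
hits = Counter()

def count_and_redirect(environ, start_response):
    """Minimal WSGI app: count the hit, then 302 to the download page."""
    artifact = environ.get("PATH_INFO", "/").lstrip("/") or "unknown"
    hits[artifact] += 1
    start_response("302 Found", [("Location", f"{MIRROR}/{artifact}")])
    return [b""]

# Simulate two clicks on the same artifact without starting a server.
def fake_request(path):
    captured = {}
    def start_response(status, headers):
        captured["status"], captured["headers"] = status, dict(headers)
    count_and_redirect({"PATH_INFO": path}, start_response)
    return captured

fake_request("/uima/uima-1.0.zip")
resp = fake_request("/uima/uima-1.0.zip")
print(hits["uima/uima-1.0.zip"], resp["status"])  # → 2 302 Found
```

As the thread notes, this counts clicks rather than completed downloads, and misses redistribution, so it is only a lower bound of interest — but it survives mirroring, which log scraping does not.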
Re: [DISCUSS] PDFBox proposal
Sounds like a good addition to ASF! Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jukka Zitting [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Wednesday, November 14, 2007 8:08:33 PM Subject: [DISCUSS] PDFBox proposal Hi, Ben Litchfield, the author of the PDFBox library, has been working with us at the ApacheCon preparing a proposal to bring PDFBox into the Apache Incubator. See http://wiki.apache.org/incubator/PDFBoxProposal for the current draft of the proposal. Some of the details are yet to be worked out, but the general idea is there. All comments and questions are welcome! BR, Jukka Zitting
Re: [PROPOSAL] Shindig, an OpenSocial Container
+1 -- my simpy.com might become a client soon. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Brian McCallister [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Friday, November 9, 2007 1:03:49 PM Subject: [PROPOSAL] Shindig, an OpenSocial Container Shindig Proposal -- = Abstract = Shindig will develop the container and backend server components for hosting OpenSocial applications. = Proposal = Shindig will develop a JavaScript container and implementations of the backend APIs and proxy required for hosting OpenSocial applications. = Background = OpenSocial provides a common set of APIs for social applications across multiple websites. With standard JavaScript and HTML, developers can create social applications that use a social network's friends and update feeds. A social application, in this context, is an application run by a third party provider and embedded in a web page, or web application, which consumes services provided by the container and by the application host. This is very similar to Portal/Portlet technology, but is based on client-side compositing, rather than server. More information can be found about OpenSocial at http://code.google.com/apis/opensocial/ == Rationale == Shindig is an implementation of an emerging set of APIs for client-side composited web applications. The Apache Software Foundation has proven to have developed a strong system and set of mores for building community-centric, open standards based systems with a wide variety of participants. A robust, community-developed implementation of these APIs will encourage compatibility between service providers, ensure an excellent implementation is available to everyone, and enable faster and easier application development for users. The Apache Software Foundation has proven it is the best place for this type of open development. = Current Status = This is a new project. 
= Meritocracy = The initial developers are very familiar with meritocratic open source development, both at Apache and elsewhere. Apache was chosen specifically because the initial developers want to encourage this style of development for the project. === Community === Shindig seeks to develop developer and user communities during incubation. = Core Developers = The initial core developers are all Ning employees. We hope to expand this very quickly. = Alignment = The developers of Shindig want to work with the Apache Software Foundation specifically because Apache has proven to provide a strong foundation and set of practices for developing standards-based infrastructure and server components. = Known Risks = == Orphaned products == Shindig is new development of an emerging set of APIs. == Inexperience with Open Source == The initial developers include long-time open source developers, including Apache Members. == Homogenous Developers == The initial group of developers is quite homogenous. Remedying this is a large part of why we want to bring the project to Apache. == Reliance on Salaried Developers == The initial group of developers is employed by a potential consumer of the project. Remedying this is a large part of why we want to bring the project to Apache. == Relationships with Other Apache Products == None in particular, except that Apache HTTPD is the best place to run PHP, which the server-side components Ning intends to donate have been implemented in. == An Excessive Fascination with the Apache Brand == We believe in the processes, systems, and framework Apache has put in place. The brand is nice, but is not why we wish to come to Apache. = Documentation = Google's OpenSocial Documentation: http://code.google.com/apis/opensocial/ Ning's OpenSocial Documentation: http://tinyurl.com/3y5ckx = Initial Source = Ning, Inc. intends to donate code based on their implementation of OpenSocial.
The backend systems will be replaced with more generic equivalents in order to not bind the implementation to specifics of the Ning platform. This code will be extracted from Ning's internal development, and has not been expanded on past the extraction. It will be provided primarily as a starting place for a much more robust, community- developed implementation. = External Dependencies = The initial codebase relies on a library created by Google, Inc., and licensed under the Apache Software License, Version 2.0. = Required Resources = Developer and user mailing lists A subversion repository A JIRA issue tracker = Initial Committers = Thomas Baker[EMAIL PROTECTED] Tim Williamson [EMAIL PROTECTED] Brian McCallister [EMAIL PROTECTED] Thomas Dudziak [EMAIL PROTECTED] Martin Traverso [EMAIL PROTECTED] = Sponsors = == Champion == Brian McCallister [EMAIL PROTECTED] == Nominated Mentors == Brian McCallister [EMAIL PROTECTED] Thomas Dudziak [EMAIL PROTECTED]
Re: Incubator Proposal: Pig
Big +1! :) Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Olga Natkovich [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, September 18, 2007 3:52:23 PM Subject: Incubator Proposal: Pig Hi, Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal. Thanks, Olga Natkovich mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] - = Pig Open Source Proposal = == Abstract == Pig is a platform for analyzing large data sets. == Proposal == The Pig project consists of high-level languages for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets. At the present time, Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties: 1. ''Ease of programming''. It is trivial to achieve parallel execution of simple, embarrassingly parallel data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain. 2. ''Optimization opportunities''. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency. 3. ''Extensibility''.
Users can create their own functions to do special-purpose processing. == Background == Pig started as a research project at Yahoo! in May of 2006 to combine ideas in parallel databases and distributed computing. The first internal release took place in July 2006. The first release was a simple front-end to the Hadoop Map/Reduce framework. The following releases added new features and evolved the language based on user feedback. In July 2007, Pig was taken over by a development team and the first production version is due to be released on 9/28/07. Since its inception, we have observed a steady growth of the user community within Yahoo!. In April 2007, Pig was released under a BSD-type license. Several external parties are using this version and have expressed interest in collaborating on its development. == Rationale == In an information-centric world, innovation is driven by ad-hoc analysis of large data sets. For example, search engine companies routinely deploy and refine services based on analyzing the recorded behavior of users, publishers, and advertisers. The rate of innovation depends on the efficiency with which data can be analyzed. To analyze large data sets efficiently, one needs parallelism. The cheapest and most scalable form of parallelism is cluster computing. Unfortunately, programming for a cluster computing environment is difficult and time-consuming. Pig makes it easy to harness the power of cluster computing for ad-hoc data analysis. While other languages exist that try to achieve the same goals, we believe that Pig provides more flexibility and gives more control to the end user. SQL typically requires (1) importing data from a user's preferred format into a database system's internal format, (2) well-structured, normalized data with a declared schema, and (3) programs expressed in declarative SELECT-FROM-WHERE blocks. In contrast, Pig Latin facilitates (1) interoperability, i.e.
data may be read/written in a format accepted by other applications such as text editors or graph generators, (2) flexibility, i.e. data may be loosely structured or have structure that is defined operationally, and (3) adoption by programmers who find procedural programming more natural than declarative programming. Sawzall is a scripting language used at Google on top of Map-Reduce. A Sawzall program has a fairly rigid structure consisting of a filtering phase (the map step) followed by an aggregation phase (the reduce step). Furthermore, only the filtering phase can be written by the user, and only a pre-built set of aggregations is available (new ones are non-trivial to add). While Pig Latin has similar higher-level primitives like filtering and aggregation, an arbitrary number of them can be flexibly chained together in a Pig Latin program, and all primitives can use user-defined functions with equal ease. Further, Pig Latin has additional primitives such as cogrouping, that allow
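The cogrouping primitive contrasted with SQL and Sawzall above can be illustrated with a small sketch. This is plain Python, not Pig Latin syntax or Pig's implementation, and the function and dataset names are invented for illustration; it only shows the shape of a cogroup result, where records from two inputs are collected per key into separate bags:

```python
from collections import defaultdict

def cogroup(left, right, key_left, key_right):
    """Group records from two datasets by a shared key, keeping each
    side's records in separate lists -- the shape of a cogroup result."""
    groups = defaultdict(lambda: ([], []))
    for rec in left:
        groups[key_left(rec)][0].append(rec)
    for rec in right:
        groups[key_right(rec)][1].append(rec)
    return dict(groups)

# Toy datasets: page visits (user, site) and site owners (site, owner).
visits = [("alice", "a.com"), ("bob", "b.com"), ("alice", "b.com")]
owners = [("b.com", "Bob Inc"), ("c.com", "Carol LLC")]

grouped = cogroup(visits, owners,
                  key_left=lambda v: v[1], key_right=lambda o: o[0])
# grouped["b.com"] pairs all visits to b.com with b.com's owner record,
# without collapsing either side the way an SQL join or aggregate would.
```

Unlike a join, each key retains both input bags intact, so later steps in the dataflow can aggregate, filter, or flatten them independently.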
Re: Board Reports - Missing and Reviews
It's too late now, but it looks like the Lucene.Net report is missing, too, no? George (from Lucene.net), what's new with Lucene.net? It looks like something happened in August, major activity - http://mail-archives.apache.org/mod_mbox/incubator-lucene-net-dev/ . Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Noel J. Bergman [EMAIL PROTECTED] To: general@incubator.apache.org Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Saturday, August 25, 2007 6:26:32 PM Subject: Board Reports - Missing and Reviews Due to the Board's schedule, there's been some extra time this month, but time's up. The Report (http://wiki.apache.org/incubator/August2007) needs to be completed, and PMC Members should review. Missing: lokahi, stdcxx, woden, wsrp4j. --- Noel - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [VOTE] Roller graduation
Finally! +1 Otis - Original Message From: Dave [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, February 6, 2007 10:07:52 AM Subject: [VOTE] Roller graduation OK, let's try this again. The Roller community believes that Roller is ready for graduation, as evidenced by this vote: http://mail-archives.apache.org/mod_mbox/incubator-roller-dev/200702.mbox/browser We would like to initiate a vote to graduate to a top level project. We would like the resolution attached to this email to be presented to the board for consideration at the next possible board meeting. For additional information, the Roller status file is here: http://incubator.apache.org/projects/roller.html Thanks for your consideration. Please commence voting... - Dave Here is the resolution: https://svn.apache.org/repos/asf/incubator/roller/trunk/tlp-resolution.txt Establish the Apache Roller project WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software related to the Roller blog server, for distribution at no charge to the public. 
NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the Apache Roller Project, be and hereby is established pursuant to Bylaws of the Foundation; and be it further RESOLVED, that the Apache Roller Project be and hereby is responsible for the creation and maintenance of open-source software related to the Roller blog server; and be it further RESOLVED, that the office of Vice President, Roller be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache Roller Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache Roller Project; and be it further RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache Roller Project: * Anil Gangolli [EMAIL PROTECTED] * Allen Gilliland [EMAIL PROTECTED] * Dave Johnson[EMAIL PROTECTED] * Matt Raible [EMAIL PROTECTED] * Craig Russell [EMAIL PROTECTED] * Matthew Schmidt [EMAIL PROTECTED] * Elias Torres[EMAIL PROTECTED] * Henri Yandell [EMAIL PROTECTED] NOW, THEREFORE, BE IT FURTHER RESOLVED, that Dave Johnson be appointed to the office of Vice President, Roller, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further RESOLVED, that the initial Apache Roller Project be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Roller Project; and be it further RESOLVED, that the initial Apache Roller Project be and hereby is tasked with the migration and rationalization of the Apache Incubator Roller podling; and be it further RESOLVED, that all responsibility pertaining to the Apache Incubator Roller podling encumbered upon the Apache 
Incubator PMC are hereafter discharged.
Re: [VOTE] graduate Solr to Lucene
+1 Otis - Original Message From: Yonik Seeley [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Friday, January 12, 2007 11:19:31 AM Subject: [VOTE] graduate Solr to Lucene The Solr community has voted and believes Solr is ready for graduation from the Incubator and has met all incubation requirements, and the Lucene PMC has voted to accept Solr. The Solr podling is therefore requesting to graduate from the Incubator to become an Apache Lucene subproject. Please send in your +1/0/-1 to approve/abstain/disapprove. References: Lucene PMC vote on its private list: Message-ID: [EMAIL PROTECTED] Solr community vote: http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200701.mbox/[EMAIL PROTECTED] The project status for Solr is at: http://incubator.apache.org/projects/solr.html The Solr home page is at: http://incubator.apache.org/solr/ -Yonik
Re: New Name for UIMA Podling?
Hi, - Original Message From: Rodent of Unusual Size [EMAIL PROTECTED] Mads Toftum wrote: +1 - there seems to have started some sort of fascination with changing names where there is no need. In general I'm not really a fan of naming things so that it is impossible to guess what a project is (that's hard enough as it is already). How would UIMA be pronounced in languages other than English? OG: How is New York pronounced in languages other than English? I think pronouncing UIMA would follow the same pattern. My question is: how do you pronounce UIMA in English? I actually don't pronounce it the English way, I pronounce it the Croatian/phonetic way - U (u as in [oo]ze) I (i as in [i]diot) M (m as in [m]ama) A (a as in [a]nother). Otis Aside from that, I'm wary of tedious and uninspiring names, like 'log4j'. I'm also wary of retaining names for projects that have had an existence prior to Apache. One reason is possible IP issues, and another is confusion. If some commercial concern has a product based on UIMA and they say so.. do they mean Apache UIMA? Pre-Apache UIMA? If they adopt the Apache package, do we need to worry about brand issues? (Answer: yes.) This is a new set of IP attributes for this item. I seriously think it needs a new name.. and not least because it's coming from the company with probably the greatest investment in software IP on the planet. With all the (baseless) remarks about Apache becoming a BigCo shill and clearinghouse, I see contraindications for maintaining the BigCo name. Just MHO. - -- #kenP-)} Ken Coar, Sanagendamgagwedweinini http://Ken.Coar.Org/ Author, developer, opinionist http://Apache-Server.Com/ Millennium hand and shrimp!
Re: New Name for UIMA Podling?
+1 for UIMA, even if some other ones are cute. Keeping UIMA makes it easy (for me) to pull just relevant pages from Google, Technorati, Simpy, etc. instead of also pulling up pages about cute animals, islands, and so on. Otis - Original Message From: Thilo Goetz [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Wednesday, October 18, 2006 7:27:40 AM Subject: Re: New Name for UIMA Podling? If there's no reason for us to change the project name, then I for one would just like to keep the one we have. We have built some name recognition around UIMA already, and I hope the Ukrainian Institute of Modern Art will forgive us for usurping the #1 spot on Google ;-) --Thilo Mads Toftum wrote: On Mon, Oct 16, 2006 at 11:58:53PM +0200, Leo Simons wrote: Note UIMA is a fine name for an apache project. We have projects like +1 - there seems to have started some sort of fascination with changing names where there is no need. In general I'm not really a fan of naming things so that it is impossible to guess what a project is (that's hard enough as it is already). vh Mads Toftum
Re: [VOTE] Mark lucene4c as dormant
[X] +1 - Mark Lucene4c as dormant. [ ] 0 - I have no opinion. [ ] -1 - No, please keep it! [include reason] Otis
Re: [Vote] accept UIMA as a podling - #2
[X] +1 Accept UIMA as an Incubator podling [ ] 0 Don't care [ ] -1 Reject this proposal for the following reason: Otis - Original Message From: Ian Holsman [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, September 26, 2006 7:17:37 PM Subject: [Vote] accept UIMA as a podling - #2 issues addressed in this release: 1. updated proposal included 2. The first paragraph explains it to a layperson 3. OASIS issue addressed [ ] +1 Accept UIMA as an Incubator podling [ ] 0 Don't care [ ] -1 Reject this proposal for the following reason: 8---Proposal--8-- Hello everyone - We are submitting this proposal to the community for a new project in the incubator, and look forward to starting to work with this community. This is a slightly modified and extended version of the proposal that has already been posted to [EMAIL PROTECTED] The whole mail thread can be found [http://www.nabble.com/Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-tf2154324.html here]. If you don't feel like reading the whole thread, the main question that came up was: this is all very well, but what does it really '''do'''? Attempts to answer that question were made [http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5986403.html here] and [http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5987788.html here]. We have since worked some of these into the proposal itself. = Proposal for Incubation Project: Unstructured Information Management Architecture - UIMA = == Abstract == UIMA is a component framework for the analysis of unstructured content such as text, audio and video. It comprises an SDK and tooling for composing and running analytic components written in Java and C++.
== Proposal: Unstructured Information Management Architecture framework == Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. We propose UIMA, a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at. UIMA enables such an application to be decomposed into components, for example ''language identification'' - ''language specific segmentation'' - ''sentence boundary detection'' - ''entity detection (person/place names etc.)''. Each component must implement interfaces defined by the framework and must provide self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages. UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes. This framework has already attracted a following among government, commercial, and academic institutions who previously developed analysis algorithms, but were unable to easily build on each other's works, and who want to be able to evolve their applications by independently upgrading parts, as better technology becomes available. Applications built with this framework are being used with plain text, audio streams, and image/video streams, identifying entities and relations, converting speech to text, translating into different languages, and determining properties of images. The UIMA framework runs components in a flow, passing a common data object containing unstructured information (free text, audio, video, etc.) 
through the components. Each component examines the unstructured information and data added by other components, and adds data of its own. The framework mandates a standardized form of the data being passed, and a standardized form of the interfaces to the components. We propose a project to develop, implement, support and enhance this framework (and, over time, other implementations) that comply with the UIMA standard (which has been submitted for standardization work within [http://www.oasis-open.org OASIS]). Members of this community are encouraged to participate in that effort, as well; OASIS has an open approach to granting Technical Committee voting rights to members of OASIS, described here: http://www.oasis-open.org/committees/process.php#2.4. The proposal includes both the framework and tools to develop, describe, compose and deploy UIMA-based components and applications. The initial work will be based on the UIMA Version 2
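The component flow described in the proposal can be sketched in miniature. The class and method names below are hypothetical stand-ins, not UIMA's actual API; the sketch only shows the pattern of a shared analysis object passing through a pipeline, with each component reading what earlier components produced and adding annotations of its own:

```python
class Cas:
    """Common analysis structure: raw text plus accumulated annotations."""
    def __init__(self, text):
        self.text = text
        self.annotations = []   # (type, start, end) spans added by components

class SentenceDetector:
    """Marks naive sentence boundaries at periods."""
    def process(self, cas):
        start = 0
        for i, ch in enumerate(cas.text):
            if ch == ".":
                cas.annotations.append(("sentence", start, i + 1))
                start = i + 2

class CapitalizedWordDetector:
    """Stand-in for entity detection: flags capitalized words."""
    def process(self, cas):
        pos = 0
        for raw in cas.text.split():
            start = cas.text.index(raw, pos)
            word = raw.strip(".")
            if word and word[0].isupper():
                cas.annotations.append(("entity", start, start + len(word)))
            pos = start + len(raw)

def run_pipeline(components, text):
    cas = Cas(text)
    for component in components:   # the framework manages the flow
        component.process(cas)
    return cas

cas = run_pipeline([SentenceDetector(), CapitalizedWordDetector()],
                   "Alice works in Paris. Bob stayed home.")
```

In real UIMA the analogous data object is typed, self-describing, and designed for efficient exchange between Java and C++ components; this sketch only conveys the flow.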
Re: [VOTE] accept UIMA as a podling
Excellent, now that this is out of the way, I'm looking forward to an improved proposal, so we can vote on it. Perhaps, if Garrett doesn't mind, you may want to run the improved proposal by him first, before sending a new [VOTE] email with the inlined proposal to the list. Otis - Original Message From: Garrett Rooney [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, September 19, 2006 7:09:34 PM Subject: Re: [VOTE] accept UIMA as a podling On 9/19/06, Ian Holsman [EMAIL PROTECTED] wrote: Personally I look at some of the enterprise java proposals and have no clue about them either as I don't track the SOA/WS specs that closely. Yes, and that's a BAD thing. If this proposal was for some j2ee/WS/SOA related monstrosity with 98 different acronyms in the first paragraph it would be getting exactly the same -1 from me. -garrett
Re: [VOTE] accept UIMA as a podling
Damn, and I was going to give it +1. The UIMA folks answered questions about what it is that UIMA really does in emails, but yes, it should be answered in the proposal itself as well (I can't connect to wiki.apache.org at the moment to see the final proposal for myself). Otis - Original Message From: Garrett Rooney [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Monday, September 18, 2006 5:11:13 PM Subject: Re: [VOTE] accept UIMA as a podling On 9/18/06, Ian Holsman [EMAIL PROTECTED] wrote: [ ] +1 Accept UIMA as an Incubator podling [ ] 0 Don't care [X] -1 Reject this proposal for the following reason: I'm sorry, but I have to vote -1 based on my new policy of rejecting any potential podling that can't explain what it is that they do within the first paragraph of the proposal. I'm a fairly intelligent person, but honestly I have no clue what an architecture and software framework for creating, discovering, composing and deploying a broad range of multi-modal analysis capabilities actually is, and I see little potential for any project that's so bad at selling themselves to actually grow a useful community. Additionally, I believe we decided that having the final vote thread point to a Wiki page was a bad idea. It would be good to resend this with the actual proposal content inline so everyone can be sure what they're actually voting on. -garrett
Re: [PROPOSAL] UIMA (Unstructured Information Management Architecture) Framework
Having finally read all the emails related to this proposal, I'm very much for this puppy entering the ASF and eventually getting it going with Lucene and friends. A few questions. 1. What you are proposing for the ASF is the UIMA 2.0 code that currently lives on SF, correct? 2. What about the SDK, and could you tell me/us what's in the SDK that is not in the SF code? (I'm confused, because your proposal includes references to tools for development and design of UIMA components, but doesn't that typically live in an SDK?) 3. I'm a bit puzzled why something that sounds like a framework/pipeline for hooking up components with pre-defined input/output adapters ends up with a 400-page user guide/book. Perhaps I should present this as a question. How come? Or is that user guide for the SDK only? Otis - Original Message From: Marshall Schor [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Saturday, September 9, 2006 8:00:57 AM Subject: [PROPOSAL] UIMA (Unstructured Information Management Architecture) Framework Hello everyone, I'm restarting this thread on the Unstructured Information Management Architecture implementation (UIMA) framework, in the hopes of moving this along better; this time it also has the prefix [PROPOSAL] which I had left out due to over-excitement at doing my first posting to this list :-) . Please consider this proposal (on the incubator wiki because it is quite long: http://wiki.apache.org/incubator/UimaProposal ), and help us move it along toward getting it voted on by the Incubator PMC. Two important clarifying emails (as well as the whole previous thread) can be found here: http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5987788.html and http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5986403.html (These are also hyperlinked in the wiki at the end of the first small section.)
-Marshall Leo Simons wrote: On Fri, Aug 25, 2006 at 06:04:04PM +0200, Thilo Goetz wrote: snip/ I hope this gives you a better idea what UIMA is about Yep, this and other explanations made it a lot clearer, thanks! UIMA sounds ambitious and interesting. cheers, Leo Niclas Hedhman wrote: On Thursday 24 August 2006 03:21, Marshall Schor wrote: Proposal for Incubation Project: Unstructured Information Management Architecture - UIMA From going from WTF is this to Hmmm... interesting after Leo's brilliant please clarify (reusable as well) mail. I think this is an area that has plenty of potential, possibly with a lot of interested parties in academia at large, I think ASF could be a good community breeding ground. I'm in favour of this, but not capable of contributing in any form. Cheers Niclas Yonik Seeley wrote: On 8/26/06, Thilo Goetz [EMAIL PROTECTED] wrote: From an application perspective, we have great hopes for a cooperation with the Lucene project. Great, I think this is something I'd like to get involved in! I've been thinking about how Solr integration could work. You then also need a search engine that can index that extra information and make it available for search. Without getting into too much detail here, some info could be immediately usable by Lucene based apps (like entity extraction, where you can add info via a new field in the document). Parts-of-speech type of stuff is currently more difficult, of course. -Yonik
Re: Adding Jeff Rodenburg to Lucene.Net as a committer
+1 for Jeff, Lucene.Net needs him! ;) Otis - Original Message From: George Aroush [EMAIL PROTECTED] To: general@incubator.apache.org Cc: Erik Hatcher [EMAIL PROTECTED]; Doug Cutting [EMAIL PROTECTED]; Jeff Rodenburg [EMAIL PROTECTED] Sent: Wednesday, April 12, 2006 11:21:51 PM Subject: Adding Jeff Rodenburg to Lucene.Net as a committer Hi folks, I am looking to add Jeff to Lucene.Net as a committer. Jeff is very active with Lucene.Net at SourceForge.net and I believe he will be a good addition to Lucene.Net. This is what I found http://apache.planetmirror.com.au/dev/pmc.html in regards to how and what needs to be done to add Jeff. Please advise if this is not it. Jeff: To get you on board, please start here: http://apache.planetmirror.com.au/dev/new-committers-guide.html -- there is some paperwork which you have to take care of. Erik, Doug: I don't know if we really need to vote on this; if so, my vote is +1. Regards, -- George
Re: [VOTE] accept Solr into incubator
This is as clear as day: +1. Otis - Original Message From: Doug Cutting [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tue 10 Jan 2006 12:21:49 PM EST Subject: [VOTE] accept Solr into incubator I propose that we accept CNET's Solr project into the incubator. Discussion on this list evidenced broad interest in this project, which bodes well for its ability to build a developer community. The Lucene PMC would be happy to accept Solr as a Lucene sub-project once it graduates from the incubator. The proposal is at: http://wiki.apache.org/incubator/SolrProposal +1 Doug
Re: Derby page updated
Unfortunately, I think sites are not being (r)synced every 4 hours. I made changes to the Lucene web site several days ago (ssh to minotaur, svn up under /www/lucene). Maybe somebody on infrastructure will know what is happening. No rush for Lucene. Otis --- David Crossley [EMAIL PROTECTED] wrote: Garrett Rooney wrote: I believe a 'svn up' needs to be run by someone on minotaur (perhaps in /www/incubator.apache.org?), then you need to wait for the sync job to copy the results over to the current web server (ajax?). Thanks for clarifying. I just did that 'svn up' so now we need to wait for the rsync to ajax. This quarter's report to the board for Infrastructure shows that the rsync happens every 4 hours. --David
Ruby Lucene port - directly under Lucene TLP?
Hello, If an existing TLP, such as Lucene, wants to develop a port, such as a Ruby port of the Lucene library, can the Lucene PMC invite that port and its developers under its wing directly, or does the port need to go through the Incubator? Note that this port is not an existing external project, but a brand new port that would be developed under Lucene from scratch by a group of 4-5 developers. Thanks, Otis