Re: [VOTE] Accept Drill into the Apache Incubator
+1 (binding)

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Ted Dunning ted.dunn...@gmail.com
To: general@incubator.apache.org
Sent: Tuesday, August 7, 2012 10:41 PM
Subject: [VOTE] Accept Drill into the Apache Incubator

I would like to call a vote for accepting Drill for incubation in the Apache Incubator. The full proposal is available below. Discussion over the last few days has been quite positive. Please cast your vote:

[ ] +1, bring Drill into Incubator
[ ] +0, I don't care either way
[ ] -1, do not bring Drill into Incubator, because...

This vote will be open for 72 hours and only votes from the Incubator PMC are binding. The start of the vote is just before 3AM UTC on 8 August so the closing time will be 3AM UTC on 11 August. Thank you for your consideration!

Ted

http://wiki.apache.org/incubator/DrillProposal

= Drill =

== Abstract ==

Drill is a distributed system for interactive analysis of large-scale datasets, inspired by [[http://research.google.com/pubs/pub36632.html|Google's Dremel]].

== Proposal ==

Drill is a distributed system for interactive analysis of large-scale datasets. Drill is similar to Google's Dremel, with the additional flexibility needed to support a broader range of query languages, data formats and data sources. It is designed to efficiently process nested data. It is a design goal to scale to 10,000 servers or more and to be able to process petabytes of data and trillions of records in seconds.

== Background ==

Many organizations have the need to run data-intensive applications, including batch processing, stream processing and interactive analysis. In recent years open source systems have emerged to address the need for scalable batch processing (Apache Hadoop) and stream processing (Storm, Apache S4). In 2010 Google published a paper called Dremel: Interactive Analysis of Web-Scale Datasets, describing a scalable system used internally for interactive analysis of nested data.
No open source project has successfully replicated the capabilities of Dremel.

== Rationale ==

There is a strong need in the market for low-latency interactive analysis of large-scale datasets, including nested data (e.g., JSON, Avro, Protocol Buffers). This need was identified by Google and addressed internally with a system called Dremel. In recent years open source systems have emerged to address the need for scalable batch processing (Apache Hadoop) and stream processing (Storm, Apache S4). Apache Hadoop, originally inspired by Google's internal MapReduce system, is used by thousands of organizations processing large-scale datasets. Apache Hadoop is designed to achieve very high throughput, but is not designed to achieve the sub-second latency needed for interactive data analysis and exploration. Drill, inspired by Google's internal Dremel system, is intended to address this need.

It is worth noting that, as explained by Google in the original paper, Dremel complements MapReduce-based computing. Dremel is not intended as a replacement for MapReduce and is often used in conjunction with it to analyze outputs of MapReduce pipelines or rapidly prototype larger computations. Indeed, Dremel and MapReduce are both used by thousands of Google employees.

Like Dremel, Drill supports a nested data model with data encoded in a number of formats such as JSON, Avro or Protocol Buffers. In many organizations nested data is the standard, so supporting a nested data model eliminates the need to normalize the data. With that said, flat data formats, such as CSV files, are naturally supported as a special case of nested data.

The Drill architecture consists of four key components/layers:

* Query languages: This layer is responsible for parsing the user's query and constructing an execution plan. The initial goal is to support the SQL-like language used by Dremel and [[https://developers.google.com/bigquery/docs/query-reference|Google BigQuery]], which we call DrQL.
However, Drill is designed to support other languages and programming models, such as the [[http://www.mongodb.org/display/DOCS/Mongo+Query+Language|Mongo Query Language]], [[http://www.cascading.org/|Cascading]] or [[https://github.com/tdunning/Plume|Plume]].

* Low-latency distributed execution engine: This layer is responsible for executing the physical plan. It provides the scalability and fault tolerance needed to efficiently query petabytes of data on 10,000 servers. Drill's execution engine is based on research in distributed execution engines (e.g., Dremel, Dryad, Hyracks, CIEL, Stratosphere) and columnar storage, and can be extended with additional operators and connectors.

* Nested data formats: This layer is responsible for supporting various data formats. The initial goal is to support the column-based format used by Dremel. Drill is designed to support schema-based formats such as Protocol Buffers/Dremel, Avro/AVRO-806/Trevni and CSV,
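The nested data model and column-striping idea behind the proposal can be illustrated with a small sketch. This is a toy in plain Java, not Drill code; the record layout and the `stripe` helper are invented for the example.

```java
import java.util.*;

// Toy illustration (not Drill code): a nested, JSON-like record set and
// the column-striping idea behind Dremel-style columnar storage.
public class ColumnarSketch {

    // Pull one top-level field out of every record into a flat column,
    // so a query touching only that field never reads the rest.
    public static List<Object> stripe(List<Map<String, Object>> records, String field) {
        List<Object> column = new ArrayList<>();
        for (Map<String, Object> r : records) {
            column.add(r.get(field));
        }
        return column;
    }

    public static void main(String[] args) {
        // Two nested records; "links" is a repeated (list-valued) field.
        List<Map<String, Object>> records = List.of(
            Map.of("name", "doc-a", "links", List.of("x", "y")),
            Map.of("name", "doc-b", "links", List.of("z"))
        );
        System.out.println(stripe(records, "name")); // [doc-a, doc-b]
        // A flat CSV row is the special case where no field nests or
        // repeats, so every column stripes to exactly one value per record.
    }
}
```

A real columnar engine additionally records repetition and definition levels so that repeated and optional nested fields can be reassembled, but the read-only-the-columns-you-need property is the same.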
Re: [PROPOSAL] Drill for the Apache Incubator
I concur with Andrzej. Let's see that VOTE Ted!

Otis
Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

From: Andrzej Bialecki a...@getopt.org
To: general@incubator.apache.org
Sent: Tuesday, August 7, 2012 5:51 PM
Subject: Re: [PROPOSAL] Drill for the Apache Incubator

On 07/08/2012 21:14, Franklin, Matthew B. wrote:

-Original Message-
From: Marvin Humphrey [mailto:mar...@rectangular.com]
Sent: Monday, August 06, 2012 12:25 PM
To: general@incubator.apache.org
Cc: Grant Ingersoll; Isabel Drost
Subject: Re: [PROPOSAL] Drill for the Apache Incubator

On Thu, Aug 2, 2012 at 3:12 PM, Ted Dunning ted.dunn...@gmail.com wrote:

Initial Source == There is no initial source code. All source code will be developed within the Apache Incubator.

Coming in without any source code is going to pose a challenge to this podling. http://www.apache.org/foundation/how-it-works.html#incubator The incubator filters projects on the basis of the likeliness of them becoming successful meritocratic communities. The basic requirements for incubation are: * a working codebase -- over the years and after several failures, the foundation came to understand that without an initial working codebase, it is generally hard to bootstrap a community. This is because merit is not well recognized by developers without a working codebase. Also, the friction that is developed during the initial design stage is likely to fragment the community.

It seems like there could be flexibility in this requirement, based on a few factors. In this case, a design discussion has been ongoing; but I would also think that any community coming in with enough people who know the Apache way may also not need as much of a solid starting point code wise.

+1. Given the credentials and the experience of proposed committers and mentors, and the fact that the initial design is already done, I don't think this is a serious risk. And it's an exciting proposal with a potentially big impact.
--
Best regards,
Andrzej Bialecki
http://www.sigram.com, blog http://www.sigram.com/blog
Information Retrieval, System Integration
Contact: info at sigram dot com

-
To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
For additional commands, e-mail: general-h...@incubator.apache.org
Re: [VOTE] accept DirectMemory as new Apache Incubator podling
+1 (member)

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Simone Tripodi simonetrip...@apache.org
To: general@incubator.apache.org
Sent: Sunday, October 2, 2011 3:36 AM
Subject: [VOTE] accept DirectMemory as new Apache Incubator podling

Hi all guys, I'm now calling a formal VOTE on the DirectMemory proposal located here: http://wiki.apache.org/incubator/DirectMemoryProposal Proposal text copied at the bottom of this email. VOTE close on Tuesday, October 4, early 7:30 AM CET. Please VOTE:

[ ] +1 Accept DirectMemory into the Apache Incubator
[ ] +0 Don't care
[ ] -1 Don't Accept DirectMemory into the Apache Incubator because...

Thanks in advance for participating! All the best, have a nice day, Simo

P.S. Here's my +1

http://people.apache.org/~simonetripodi/
http://www.99soft.org/

= DirectMemory =

== Abstract ==

The following proposal is about Apache !DirectMemory, a Java !OpenSource multi-layered cache implementation featuring off-heap memory storage (a la Terracotta !BigMemory) to enable caching of Java objects without degrading JVM performance.

== Proposal ==

!DirectMemory's main purpose is to act as a second-level cache (after a heap-based one) able to store large amounts of data without filling up the Java heap and thus avoiding long garbage collection cycles. Although serialization has a runtime cost, store/retrieve operations are in the sub-millisecond range, which is acceptable in nearly every usage scenario, even as a first-level cache; most of all, off-heap storage outperforms heap storage when the number of entries grows beyond a certain point. !DirectMemory implements cache eviction based on a simple LFU (Least Frequently Used) algorithm and also on item expiration. Included in the box is a small set of utility classes to easily handle off-heap memory buffers.

== Background ==

!DirectMemory is a project that was born in 2010 thanks to Raffaele P.
Guidi's initial effort under [[https://github.com/raffaeleguidi/!DirectMemory/|GitHub]], and is already licensed under the Apache License 2.0.

== Rationale ==

The rationale behind !DirectMemory is bringing off-heap caching to the open source world, empowering FOSS developers and products with a tool that enables breaking the heap barrier and overriding the JVM garbage collection mechanism. This could be useful in scenarios where RAM needs exceed the usual limits (more than 8, 12 or 24 GB) and to ease usage of off-heap memory in general.

= Current Status =

== Meritocracy ==

As a majority of the initial project members are existing ASF committers, we recognize the desirability of running the project as a meritocracy. We are eager to engage other members of the community and operate to the standard of meritocracy that Apache emphasizes; we believe this is the most effective method of growing our community and enabling widespread adoption.

== Core Developers ==

In alphabetical order:

* Christian Grobmeier grobmeier at apache dot org
* Maurizio Cucchiara mcucchiara at apache dot org
* Olivier Lamy olamy at apache dot org
* Raffaele P. Guidi raffaele dot p dot guidi at gmail dot com
* Simone Gianni simoneg at apache dot org
* Simone Tripodi simonetripodi at apache dot org
* Tommaso Teofili tommaso at apache dot org

== Alignment ==

The purpose of the project is to develop and maintain a !DirectMemory implementation that can be used by other Apache projects.

= Known Risks =

== Orphaned Products ==

!DirectMemory does not have any reported production usage yet, but it is getting traction with developers and being evaluated by potential users, so the risk of it being orphaned is minimal.

== Inexperience with Open Source ==

All of the committers have experience working in one or more open source projects inside and outside the ASF.
== Homogeneous Developers ==

The list of initial committers is geographically distributed across Europe, with no one company being associated with a majority of the developers. Many of these initial developers are experienced Apache committers already, and all are experienced with working in distributed development communities.

== Reliance on Salaried Developers ==

To the best of our knowledge, none of the initial committers are being paid to develop code for this project.

== Relationships with Other Apache Products ==

!DirectMemory fits naturally in the ASF because it could be successfully employed together with a large number of ASF products, ranging from JCS (as a new cache region between the heap and indexed-file ones), to ORM systems like Cayenne (i.e. replacing the current OSCache-based implementation), Apache JDO and JPA implementations, and also Java-based databases (e.g. Derby) and systems managing large amounts of data, from Hadoop to Cassandra.

== An Excessive Fascination with the Apache Brand ==

While the Apache Software Foundation would be a good home for the !DirectMemory project it already has
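The off-heap storage that the DirectMemory proposal above describes can be sketched with nothing but the JDK's direct buffers. This is an illustrative toy under the assumption that cached values are serialized to bytes; the `store`/`retrieve` helpers are invented for the example and are not DirectMemory's actual API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Toy sketch of off-heap caching (not DirectMemory's API): the payload
// lives in a direct ByteBuffer outside the Java heap, so the garbage
// collector never has to trace or copy it.
public class OffHeapSketch {

    // Serialize a value into freshly allocated off-heap memory.
    public static ByteBuffer store(String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        ByteBuffer buf = ByteBuffer.allocateDirect(bytes.length);
        buf.put(bytes);
        buf.flip(); // make the buffer readable from the start
        return buf;
    }

    // Deserialize the value back onto the heap on retrieval; this
    // serialization round trip is the runtime cost the proposal mentions.
    public static String retrieve(ByteBuffer buf) {
        byte[] bytes = new byte[buf.remaining()];
        buf.duplicate().get(bytes); // duplicate() leaves the original position intact
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer offHeap = store("cached-value");
        System.out.println(offHeap.isDirect());  // true
        System.out.println(retrieve(offHeap));   // cached-value
    }
}
```

A real implementation would additionally pool and slice a few large direct buffers instead of allocating one per entry, and layer LFU eviction and expiration on top.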
Re: [VOTE] S4 to join the Incubator
+1

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Patrick Hunt ph...@apache.org
To: general@incubator.apache.org
Sent: Tuesday, September 20, 2011 4:56 PM
Subject: [VOTE] S4 to join the Incubator

It's been nearly a week since the S4 proposal was submitted for discussion. A few questions were asked, and the proposal was clarified in response. Sufficient mentors have volunteered. I thus feel we are now ready for a vote. The latest proposal can be found at the end of this email and at: http://wiki.apache.org/incubator/S4Proposal The discussion regarding the proposal can be found at: http://s.apache.org/RMU Please cast your votes:

[ ] +1 Accept S4 for incubation
[ ] +0 Indifferent to S4 incubation
[ ] -1 Reject S4 for incubation

This vote will close 72 hours from now. Thanks, Patrick

--

= S4 Proposal =

== Abstract ==

S4 (Simple Scalable Streaming System) is a general-purpose, distributed, scalable, partially fault-tolerant, pluggable platform that allows programmers to easily develop applications for processing continuous, unbounded streams of data.

== Proposal ==

S4 is a software platform written in Java. Clients that send and receive events can be written in any programming language. S4 also includes a collection of modules called Processing Elements (or PEs for short) that implement basic functionality and can be used by application developers. In S4, keyed data events are routed with affinity to Processing Elements (PEs), which consume the events and do one or both of the following: (1) ''emit'' one or more events which may be consumed by other PEs, (2) ''publish'' results. The architecture resembles the Actors model, providing semantics of encapsulation and location transparency, thus allowing applications to be massively concurrent while exposing a simple programming interface to application developers.
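The keyed routing described above (every event with the same key reaches the same PE instance, which keeps per-key state) can be sketched roughly as follows. The classes and method names are invented for illustration and do not reflect S4's real API.

```java
import java.util.*;

// Toy sketch of S4-style keyed routing (invented API, not S4's): events
// carry a key, and all events with the same key are delivered to the same
// Processing Element (PE) instance, which keeps per-key state.
public class PeSketch {

    // A trivial PE that counts the events routed to it; a real PE could
    // also emit new events downstream or publish results.
    public static class CounterPE {
        private int count = 0;
        public void processEvent(String event) { count++; }
        public int getCount() { return count; }
    }

    // Keyed affinity: one PE instance per distinct key.
    private static final Map<String, CounterPE> pesByKey = new HashMap<>();

    public static CounterPE route(String key) {
        return pesByKey.computeIfAbsent(key, k -> new CounterPE());
    }

    public static void main(String[] args) {
        // Three events, two distinct keys.
        for (String key : List.of("query:foo", "query:bar", "query:foo")) {
            route(key).processEvent("click");
        }
        System.out.println(route("query:foo").getCount()); // 2
        System.out.println(route("query:bar").getCount()); // 1
    }
}
```

In a cluster the map lookup would be replaced by hashing the key to a partition on some node, which is what gives the platform its location transparency.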
To drive adoption and increase the number of contributors to the project, we may need to prioritize the focus based on feedback from the community. We believe that one of the top priorities and a driving design principle for the S4 project is to provide a simple API that hides most of the complexity associated with distributed systems and concurrency. The project grew out of the need to provide a flexible platform for application developers and scientists that can be used for quick experimentation and production.

S4 differs from existing Apache projects in a number of fundamental ways. Flume is an Incubator project that focuses on log processing, performing lightweight processing in a distributed fashion and accumulating log data in a centralized repository for batch processing. S4 instead performs all stream processing in a distributed fashion and enables applications to form arbitrary graphs to process streams of events. We see Flume as a complementary project. We also expect S4 to complement Hadoop processing and in some cases to supersede it.

Kafka is another Incubator project that focuses on processing large amounts of stream data. The design of Kafka, however, follows the pub-sub paradigm, which focuses on delivering messages containing arbitrary data from source processes (publishers) to consumer processes (subscribers). Compared to S4, Kafka is an intermediate step between data generation and processing, while S4 is itself a platform for processing streams of events. S4 overall addresses a need of existing applications to process streams of events beyond moving data to a centralized repository for batch processing. It complements the features of existing Apache projects, such as Hadoop, Flume, and Kafka, by providing a flexible platform for distributed event processing.

== Background ==

S4 was initially developed at Yahoo! Labs starting in 2008 to process user feedback in the context of search advertising.
The project was licensed under the Apache License version 2.0 in October 2010. The project documentation is currently available at http://s4.io .

== Rationale ==

Stream computing has been growing steadily over the last 20 years. However, recently there has been an explosion in real-time data sources including the Web, sensor networks, financial securities analysis and trading, traffic monitoring, natural language processing of news and social data, and much more. While Hadoop has evolved into a standard open source solution for batch processing of massive data sets, there is no equivalent community-supported open source platform for processing data streams in real time. While various research projects have evolved into proprietary commercial products, S4 has the potential to fill the gap. Many projects that require a scalable stream processing architecture currently use Hadoop by segmenting the input stream into data batches. This solution is not efficient, results in high latency, and introduces unnecessary complexity. The S4 design is
Re: [DISCUSS] DirectMemory to join the Apache Incubator
Oh I hope this gets into ASF. I mentioned DirectMemory on one of the Apache MLs the other day in the context of somebody at Facebook? or Cloudera? working on something very similar for another ASF project maybe HBase. Just mentioning it.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

From: Simone Tripodi simonetrip...@apache.org
To: general@incubator.apache.org
Sent: Tuesday, September 20, 2011 5:48 AM
Subject: [DISCUSS] DirectMemory to join the Apache Incubator

Hi all guys, I would like to propose DirectMemory, a Java OpenSource multi-layered cache implementation featuring off-heap memory storage (a la Terracotta BigMemory) originally developed by Raffaele P. Guidi on GitHub[1], to be an Apache Incubator project. For those interested in knowing more about DirectMemory, you can read Raffaele's related blog[2]. Here's a link to the proposal in the Incubator wiki[3] where we started collecting all needed info. As you will note, the list of mentors is in need of some volunteers, so if you find this interesting, feel free to sign up or let us know you are interested :). Hope to read from you soon, thanks in advance and have a nice day! All the best, Simo

[1] https://github.com/raffaeleguidi/DirectMemory
[2] http://raffaeleguidi.wordpress.com/
[3] http://wiki.apache.org/incubator/DirectMemoryProposal

http://people.apache.org/~simonetripodi/
http://www.99soft.org/
Re: [PROPOSAL] Flume for the Apache Incubator
Looks good to me, Jon. We've contributed to Flume before and plan on making at least a few more contributions in the near future. I look forward to doing that under ASF.

Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

- Original Message
From: Jonathan Hsieh j...@cloudera.com
To: general@incubator.apache.org
Sent: Fri, May 27, 2011 10:18:33 AM
Subject: [PROPOSAL] Flume for the Apache Incubator

Howdy! I would like to propose Flume to be an Apache Incubator project. Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data to scalable data storage systems such as Apache Hadoop's HDFS. Here's a link to the proposal in the Incubator wiki: http://wiki.apache.org/incubator/FlumeProposal I've also pasted the initial contents below. Thanks! Jon.

= Flume - A Distributed Log Collection System =

== Abstract ==

Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data to scalable data storage systems such as Apache Hadoop's HDFS.

== Proposal ==

Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of log data from many different sources to a centralized data store. Its main goal is to deliver data from applications to Hadoop's HDFS. It has a simple and flexible architecture for transporting streaming event data via Flume nodes to the data store. It is robust and fault-tolerant with tunable reliability mechanisms that rely upon many failover and recovery mechanisms. The system is centrally configured and allows for intelligent dynamic management. It uses a simple extensible data model that allows for lightweight online analytic applications. It provides a pluggable mechanism by which new sources, destinations, and analytic functions can be integrated within a Flume pipeline.
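The pluggable source-to-sink pipeline described in the proposal might be sketched as follows. The interfaces here are invented for illustration and are not Flume's actual API; a stand-in list plays the role of the HDFS sink.

```java
import java.util.*;
import java.util.function.*;

// Toy sketch of a pluggable source -> decorator -> sink pipeline in the
// spirit of Flume's data model (invented interfaces, not Flume's API).
public class PipelineSketch {

    // Drain a source through a lightweight online transform into a sink.
    public static List<String> run(Supplier<List<String>> source,
                                   UnaryOperator<String> decorator,
                                   List<String> sink) {
        for (String event : source.get()) {
            sink.add(decorator.apply(event));
        }
        return sink;
    }

    public static void main(String[] args) {
        // Source: pretend these lines were tailed from a web server log.
        Supplier<List<String>> tail = () -> List.of("GET /a", "GET /b");
        // Decorator: stamp each event with its origin host.
        // Sink: a list standing in for an HDFS writer.
        List<String> hdfs = run(tail, e -> "host1 " + e, new ArrayList<>());
        System.out.println(hdfs); // [host1 GET /a, host1 GET /b]
    }
}
```

Because the source, decorator, and sink are independent plug points, swapping the stand-in sink for a real HDFS writer (or adding another decorator) would not change the pipeline's shape, which is the extensibility argument the proposal makes.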
== Background ==

Flume was initially developed by Cloudera to enable reliable and simplified collection of log information from many distributed sources. It was later open-sourced by Cloudera on GitHub as an Apache 2.0 licensed project in June 2010. During this time Flume has been formally released five times as versions 0.9.0 (June 2010), 0.9.1 (Aug 2010), 0.9.1u1 (Oct 2010), 0.9.2 (Nov 2010), and 0.9.3 (Feb 2011). These releases are also distributed by Cloudera as source and binaries, along with enhancements, as part of Cloudera's Distribution including Apache Hadoop (CDH).

== Rationale ==

Collecting log information in a data center in a timely, reliable, and efficient manner is a difficult but important challenge, because when aggregated and analyzed, log information can yield valuable business insights. We believe that users and operators need a manageable, systematic approach to log collection that simplifies the creation, the monitoring, and the administration of reliable log data pipelines. Oftentimes today, this collection is attempted by periodically shipping data in batches and by using potentially unreliable and inefficient ad-hoc methods. Log data is typically generated in various systems running within a data center that can range from a few machines to hundreds of machines. In aggregate, the data acts like a large-volume continuous stream whose contents can have highly varied formats and content. The volume and variety of raw log data make Apache Hadoop's HDFS file system an ideal storage location before the eventual analysis. Unfortunately, HDFS has limitations with regard to durability as well as scaling limitations when handling a large number of low-bandwidth connections or small files. Similar technical challenges arise when attempting to write data to other data storage services. Flume addresses these challenges by providing a reliable, scalable, manageable, and extensible solution.
It uses a streaming design for capturing and aggregating log information from varied sources in a distributed environment, and has centralized management features for minimal configuration and management overhead.

== Initial Goals ==

Flume is currently in its first major release, with a considerable number of enhancement requests, tasks, and issues recorded towards its future development. The initial goal of this project will be to continue to build community in the spirit of the Apache Way, and to address the highly requested features and bug fixes towards the next dot release. Some goals include:

* To stand up a sustaining Apache-based community around the Flume codebase.
* Implementing core functionality of a usable highly-available Flume master.
* Performance, usability, and robustness improvements.
* Improving the ability to
Re: Name change from Lucene Connectors Framework to Apache Connectors Framework
Here's a non-abstract one:

- Apache Data (Source?) Connectors?

Perhaps Data (Source) would make it clear what this is about.

Otis

- Original Message
From: Benson Margulies bimargul...@gmail.com
To: general@incubator.apache.org
Sent: Mon, August 30, 2010 1:23:35 PM
Subject: Re: Name change from Lucene Connectors Framework to Apache Connectors Framework

It seems to me that the pivotal problem here is the word connector. On the one hand, it could mean almost anything to almost anyone. On the other hand, it has a specific denotation in the vicinity of httpd. Everything at Apache is in the vicinity of httpd. I'd offer the following 'made-up' options, all following Apache:

- manifold (many connections)
- omnivore (eats anything)
- rapunzel (spins straw into gold)
- diogenes (seeking for something)
- lantern (ditto)
- helium (fuel for Solr)

The whole question of brand management strikes me as interesting: is it, in fact, the job of the Incubator PMC to groom the Apache branding portfolio by guiding new projects towards better names? Is that in our charter, or should we, as Chris suggests, defer to someone else for problems in this area?

On Mon, Aug 30, 2010 at 1:12 PM, Mattmann, Chris A (388J) chris.a.mattm...@jpl.nasa.gov wrote:

Guys, If I may: since we're discussing marks, why not post to trademarks@ and ask Shane and crew to weigh in? Maybe you have already, but if so, I haven't seen that discussion mentioned over here on gene...@incubator. Thanks! Cheers, Chris

On 8/30/10 10:03 AM, Grant Ingersoll gsing...@apache.org wrote:

On Aug 27, 2010, at 12:15 PM, David Jencks wrote:

To try to illustrate my thinking rather than push a name down your throat...

Open ConnectorFramework/OpenConnectorFramework/OpenCF: OK, since you've added a branding word. Not ideal since the purpose appears overly broad.

Content Connector Framework/ContentConnectorFramework/CCF: OK, since you've clarified the scope. Not ideal since it has no branding word.
OpenContentConnectorFramework/OpenCCF: better, since it clarifies the scope and includes a branding word.

So, the word open somehow alleviates your concern? I don't get that. If your objection is that it comes across as being _the_ Apache connector library, then how does Open modulate that? It's still the Apache Open Connector Framework. It's still descriptive and still implies it's the one. Besides, it's the ASF, isn't Open implied/redundant? We would never have the Apache Closed Connector Framework, right? Likewise, the word Content implies the same only status, albeit here I will give you that it distinguishes it from Tomcat Connector somewhat, although the Tomcat Connector is just that, the Tomcat connector. However, I still don't buy that it is a branding word. Content is pretty much meaningless. Everything is content. I have no doubt that we could write a plugin for ACF that connected to Tomcat and got Content out of it. Heck, we already do. It's called a web crawler. So, that leaves us, in my mind, with the option of some made-up name, or we stick with ACF. I'm all for a made-up one if someone comes up with one, I just don't know what it is and no one in the community seems to have one either. ACF fits and the community likes it. It's not unprecedented at the ASF and I don't think it is confusing with Tomcat Connector. At any rate, the community would like some resolution. Should I just call an official vote on ACF and if it loses then we will go back to the drawing board?

-Grant

++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.mattm...@jpl.nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++
Re: [VOTE] Move Lucy to the Incubator
+1

- Original Message
From: Chris Hostetter hossman_incuba...@fucit.org
To: general@incubator.apache.org
Sent: Sat, July 17, 2010 6:23:09 PM
Subject: [VOTE] Move Lucy to the Incubator

I would like to call a vote for accepting Apache Lucy for incubation in the Apache Incubator. The full proposal is available below. We ask the Incubator PMC to sponsor it, with myself (hossman) as Champion, and mattmann, upayavira, mikemccand, and hossman volunteering to be Mentors. Please cast your vote:

[ ] +1, bring Lucy into Incubator
[ ] +0, I don't care either way
[ ] -1, do not bring Lucy into Incubator, because...

This vote will be open for 72 hours and only votes from the Incubator PMC are binding.

http://wiki.apache.org/incubator/LucyProposal

PREFACE

Lucy is a sub-project which is being spun off from the Lucene TLP but is not yet ready for graduation. We propose to address certain needs of the project by transitioning to an Incubator Podling, and assimilating the KinoSearch codebase.

ABSTRACT

Lucy will be a loose port of the Lucene search engine library, written in C and targeted at dynamic language users.

PROPOSAL

Lucy has two aims. First, it will be a high-performance C search engine library. Second, it will maximize its usability and power when accessed via dynamic language bindings. To that end, it will present highly idiomatic, carefully tailored APIs for each of its host binding languages, including support for subclasses written entirely in the host language.

BACKGROUND

Lucy, a loose C port of Java Lucene, began as an ambitious, from-scratch Lucene sub-project, with David Balmain (author of Ferret, a Ruby/C port of Lucene), Doug Cutting, and Marvin Humphrey (founder of KinoSearch, a Perl/C port) as committers. During an initial burst of activity, the overall architecture for Lucy was sketched out by Dave and Marvin. Unfortunately, Dave became unavailable soon after, and without a working codebase to release or any users, it proved difficult to replace him.
Still, Marvin carried on their work throughout a period of seemingly low activity. In the last year, that work has come to fruition: major technical milestones have been achieved and Lucy's underpinnings have been completed. Additionally, other developers from the KinoSearch community have taken an interest in Lucy and have begun to ramp up their contributions. The next steps for Lucy were articulated by the Lucene PMC in a recent review: make releases, acquire users, grow community. To implement the Lucene PMC's recommendations and get to a release as quickly as possible, the Lucy community proposes to assimilate the KinoSearch codebase, which has been retrofitted to use Lucy's core. Lucy still lacks a number of important indexing and search classes; we wish to flesh these out via IP clearance work rather than software development. Because Lucene is working to move away from being an umbrella project, a long term goal of the Lucy project is to graduate to an ASF TLP. With that in mind, it seems more appropriate for the KinoSearch software grant to take place within the context of the Incubator, and that a Lucy podling and PPMC be established which will ultimately take responsibility for the codebase.

RATIONALE

There is great hunger for a search engine library in the mode of Lucene which is accessible from various dynamic languages, and for one accessible from pure C. Individuals naturally wish to code in their language of choice. Organizations which do not have significant Java expertise may not want to support Java strictly for the sake of running a Lucene installation. Developers may want to take advantage of C's interoperability and fine-grained control. Lucy will meet all these demands. Apache is a natural home for our project given the way it has always operated: user-driven innovation, security as a requirement, lively and amiable mailing list discussions, strength through diversity, and so on.
We feel comfortable here, and we believe that we will become exemplary Apache citizens. INITIAL GOALS * Make a 1.0 stable release as quickly as possible. * Concentrate on community expansion. * Expose a public C API. CURRENT STATUS Meritocracy Our initial committer list includes two individuals (Peter Karman and Nathan Kurz) who started off as KinoSearch users, demonstrated merit through constructive forum participation, adept negotiation, consensus building, and submission of high-quality contributions, and were invited to become committers. Peter now rolls most
Re: [PROPOSAL] jSpirit Project
Grégoire, no attachment. ML software doesn't like it. I suggest you put it on SF. Otis - Original Message From: Grégoire Rolland grolland.jspi...@gmail.com To: general@incubator.apache.org Sent: Mon, July 19, 2010 4:35:54 AM Subject: Re: [PROPOSAL] jSpirit Project Hello, I have appended the proposal with several answers about SaaS, multi-tenancy, respect for standards, and what jSpirit really is. Thanks to Otis for the feedback. Don't hesitate to send questions, feedback, and new ideas; we want to build this project with anyone in the community who is interested. Best Regards, Grégoire On 16/07/2010 16:57, Grégoire Rolland wrote: Hello, I'm here to propose a new project for the Apache Incubator, related to a previous post I wrote here. You can find the first draft of the proposal here [ http://wiki.apache.org/incubator/JSpiritProposal ]. We are looking for a Champion, Mentors, and interested developers, and we respectfully ask the Incubator to sponsor this project. We would be happy to receive your feedback about this proposal. Thanks for your support; we are happy to begin new work with you! Best Regards, Here is the text of the proposal: = Abstract = jSpirit will be a platform for efficiently developing enterprise-class SaaS applications with real multi-tenant support and cloud deployment. = Proposal = jSpirit will provide a technical foundation on which application developers can create enterprise software distributed as services. jSpirit will implement a global, out-of-the-box architecture supporting multi-tenancy. By multi-tenancy, I mean an architecture that shares the same application among multiple clients, with support for client-specific behavior. The technical foundation will include an integration framework designed to simplify and abstract the technical complexity of J2EE for the application developer, a set of tools to industrialize the production of applications, a complete application stack, and a set of methods and recommendations for developing efficiently. 
= Background = jSpirit was initially developed for a French company that wanted to create a multi-tenant SaaS ERP for trading in the agribusiness world. The application is now finished, and this company has opened the code of the project's foundation. At the time, there was no foundation framework providing multi-tenancy support, so something like jSpirit needed to be developed. The experience of developing such an application showed that there is a need for tools and methods to do this. = Rationale = I think there is a strong need for architecture and simplicity in the Java world. Multi-tenancy problems are difficult to solve, and the need for such applications will grow in the future. jSpirit will implement an out-of-the-box architecture, a seamless programming model, and technical modules to simplify development. jSpirit's goal is to become a concentration of the experience of open-source and advanced J2EE developers, providing a platform for efficiently developing applications in the SaaS and multi-tenant world. = Initial Goals = The first goal is to develop a user and developer community around the project to ensure the quality and usability of the platform. Our open-source experience is not extensive, so we think it's important to rely on a community to make the project live. The second goal is to document the project to make it more usable as-is. The third goal is to enlarge its functionality and make the project more coherent with the Apache ecosystem. = Current Status = == Code Base == All the code is here: [[http://sourceforge.net/projects/jspirit/|Sourceforge]]. The current code base implements all the functionality below. 
=== Architecture === * Multi-tiered architecture out of the box: implementation of Integration Layer, Business Layer, Client Layer * Java 5 annotation- and auto-injection-based lookup of services * Classpath scanning for auto-discovery of components * Modular and pluggable architecture: automatic activation of modules on the classpath, ready for seamless integration * Implementation of the Long-Conversation pattern, with JTA 2PC support (via the Geronimo Transaction Manager) and implicit demarcation (explicit demarcation is always possible) * [in progress] AOP interceptors on top of each layer === Integration Layer === * Implementation of abstract integration services and an abstract persister based on JPA * Maven plugins for code generation of the integration layer from an XML description of the component business model: generates persistent classes, access services, queries, constraints, JPA annotations, and Lucene indexing of the business model * Bean Validation integration * Full multi-tenancy integration on the EntityManager and caches * Multi-tenant PostgreSQL support === Business Layer === * Implementation of
Re: New project proposal
Grégoire, Could you please point us/me to some information about jSpirit functionality that is SaaS-specific? Understanding that may help people figure out what jSpirit brings and does. For example, if I use jSpirit, which SaaS-specific functionality does a developer not have to develop? What functionality comes out of the box? etc. Thanks, Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message From: Grégoire Rolland grolland.jspi...@gmail.com To: general@incubator.apache.org Sent: Tue, July 13, 2010 4:11:52 AM Subject: New project proposal Hello, I'm the project leader of an open-source project called jSpirit. The goal of the project is to create an open-source platform for efficiently developing enterprise-class, lightweight J2EE applications for SaaS with multi-tenant support. The code is available here (http://sourceforge.net/projects/jspirit/). The platform focuses on the technical aspects of SaaS and multi-tenancy. I would like my project to become an Apache Incubator project, and I need help to do this. I think this kind of platform could interest a large community. The goals are to provide an open-source application stack (focused on Apache projects), tools for efficient development, an architectural model for enterprise-class applications, methods for project management, and an integration framework to rescue application developers from J2EE and multi-tenant complexity. The project is already used by a French company as the foundation of its ERP (Husson Ingenierie, http://husson-info.fr); it is the base of the community so far. I want to develop my professional activity around this project, so I think it will be a long-lived project. Is anyone interested in this project? 
Best Regards, -- Grégoire Rolland Projet *jSpirit* Tel : (+33) (0) 6 82 77 59 94 mailto:grolland.jspi...@gmail.com - To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org For additional commands, e-mail: general-h...@incubator.apache.org
Re: [Proposal] JPPF : a parallel processing framework for Java
Was my thinking, too. How long before dc.apache.org (or some variant of it) is formed? Otis -- Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch - Original Message From: Grant Ingersoll gsing...@apache.org To: general@incubator.apache.org Sent: Tue, January 12, 2010 3:50:52 PM Subject: Re: [Proposal] JPPF : a parallel processing framework for Java On Jan 12, 2010, at 1:47 PM, Alan D. Cabrera wrote: On Jan 12, 2010, at 7:27 AM, Grant Ingersoll wrote: On Jan 12, 2010, at 10:12 AM, Emmanuel Lécharny wrote: Grant Ingersoll wrote: Seems like this might fit nicely with Hadoop. Has anyone approached their PMC about sponsoring? No, not yet, but that's clearly an option. At least, a better fit than MINA, IMO. Let's do that. Yeah, Hadoop isn't just about Map-Reduce. Just curious, if there's no Hadoop tech in the project then why have it sponsored by Hadoop? I'd add, I think most in Hadoop land view Hadoop as one of the primary places for large-scale distributed computing at Apache. Map-Reduce is one approach and it does not fit all situations, so I think you'll see other things arise there, possibly JPPF.
Re: [VOTE] Incubate Lucene Connector Framework
herein described is accepted. MetaCarta patents are not infringed by this grant. Also, MetaCarta trademarks are not included in this grant. External Dependencies The project dependencies, other than on other Apache projects, are as follows: The ConnectorFramework core currently uses the Bitmechanic JDBC pool driver, which is BSD licensed, and the Postgresql JDBC driver, which is also BSD licensed. The LiveLink Connector relies on LAPI, which is privately licensed by OpenText. The Documentum Connector relies on DFC, which is privately licensed by EMC. The Share Connector relies on jCIFS, which is LGPL. The Memex Connector relies on privately licensed java libraries from Memex. The FileNet Connector relies on privately licensed java libraries from IBM. Required Resources • Mailing lists • connectors-private (with moderated subscriptions) • connectors-user@ • connectors-dev@ • connectors-commit@ • Subversion directory • https://svn.apache.org/repos/asf/incubator/connectors • Website • Confluence (CONNECTORS) • Issue Tracking • JIRA (CONNECTORS) Initial Committers Names of initial committers with affiliation and current ASF status: • Karl Wright (kwright at metacarta) • Josiah Strandberg (jstrandberg at metacarta) • Ken Baker (bakerkj at metacarta) • Marc Meadows (mam at metacarta) • Grant Ingersoll ( gsing...@a.o Lucid Imagination, ASF Member) • Brian Pinkerton (brian.pinkerton at Lucid Imagination) • Simon Willnauer (simonw at apache org, Committer on Lucene Java and Lucene Open Relevance Project) • Ryan McKinley (ryan at apache org, Committer on Lucene and Solr) • Robert Muir (rmuir at apache org, Committer on Lucene and Open Relevance) • Sami Siren ( si...@a.o , Committer on Nutch and Tika) • Otis Gospodnetic ( o...@a.o , Committer on Lucene, Solr, Nutch, Mahout, and Open Relevance Project) • Shalin Shekhar Mangar ( sha...@a.o , AOL, Committer on Apache Solr) • Noble Paul ( no...@a.o , AOL, Committer on Apache Solr) • George Aroush (george at aroush.net, Committer 
on Lucene.Net) Sponsors Champion • Grant Ingersoll Nominated Mentors • Grant Ingersoll • Jukka Zitting • Gianugo Rabellino Sponsoring Entity • Apache Lucene PMC: Message ID: af7e...@gmail.com in priv...@lucene.a.o
Re: [VOTE] Graduate Lucene.Net as a subproject under Apache Lucene
+1 - Original Message From: George Aroush geo...@aroush.net To: general@incubator.apache.org Sent: Wed, October 7, 2009 9:59:43 PM Subject: [VOTE] Graduate Lucene.Net as a subproject under Apache Lucene Hi Folks, On behalf of the Lucene.Net mentor, committers, and community, this is a call to vote on graduating the Lucene.Net project (http://incubator.apache.org/lucene.net/) as a sub-project under Apache Lucene. The Lucene.Net mentor, committers, and the community have voted as follows: +1 from Erik Hatcher (mentor) +1 from George Aroush (committer) +1 from Isik YIGIT (aka: DIGY) (committer) +1 from Doug Sale (committer) +1 from a total of 70+ Lucene.Net members / followers / users. (with no -1 or 0 votes) The vote result can be found here: http://mail-archives.apache.org/mod_mbox/incubator-lucene-net-user/200909.mbox/%3c166a01ca3739$13947380$3abd5a...@net%3e The rationale for graduation is: * Lucene.Net has been under incubation since April 2006 (3 1/2 years now). * During incubation, Lucene.Net has: - Made 1 official release (Incubating-Apache-Lucene.Net-2.0-004-11Mar07). - Released, as SVN tags, 18 ports of Java Lucene (from 1.9 to 2.4.0). - Released, as SVN tags, ports of WordNet.Net 2.0, SpellChecker.Net 2.0, Snowball.Net 2.0, and Highlighter.Net 2.0. - Released MSDN-style documentation for the above release. - Accepted two new committers: Isik YIGIT (DIGY) digydigy @ gmail.com and Doug Sale dsale @ myspace-inc.com were added in November 2008 (George Aroush george @ aroush.net is the original committer). - Grown the community, with a healthy following. - Is being used by well-established companies in production (I'm not sure about the legality of mentioning their names here, or even whether I have the complete list). - Is being used by the Beagle project. 
* Work is already under way to port Java Lucene 2.9 to Lucene.Net 2.9. If this graduation is approved, Lucene.Net will officially be called Apache Lucene.Net. Please cast your votes: [ ] +1 Graduate Lucene.Net as a sub-project under Apache Lucene. [ ] -1 Lucene.Net is not ready to graduate as a sub-project under Apache Lucene, because ... This vote will close on October 17th, 2009. Regards, -- George
Re: [VOTE] Accept Wink proposal for incubation
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Nicholas L Gallardo nlgal...@us.ibm.com To: general@incubator.apache.org Cc: Bryant Luk b...@us.ibm.com; Christopher J Blythe cjbly...@us.ibm.com; Dustin Amrhein damr...@us.ibm.com; Baram, Eliezer eba...@hp.com; el...@hp.com; Greg Truty gtr...@us.ibm.com; Jesse A Ramos jra...@us.ibm.com; Snitkovsky, Martin martin.snitkov...@hp.com; Michael Rheinheimer r...@us.ibm.com; nadav.fisc...@hp.com; tali.alsaigh-co...@hp.com; tomer.sh...@hp.com Sent: Friday, May 15, 2009 11:54:35 AM Subject: [VOTE] Accept Wink proposal for incubation Dear Incubator PMC Members, The Wink team would like to officially present the proposal for the Wink REST runtime for incubation in the Apache Incubator. This proposal has been surfaced previously and is also available at: http://wiki.apache.org/incubator/WinkProposal Please cast your votes: [ ] +1, Accept Wink for incubation [ ] +0, Indifferent to Wink incubation [ ] -1, Reject Wink for incubation (if so, please help us understand why) The formal proposal, included below, provides supporting details on why this proposal is coming forward and who is involved. Thanks and cheers on behalf of the team. - Abstract - Apache Wink is a project that enables development and consumption of REST style web services. The core server runtime is based on the JAX-RS (JSR 311) standard. The project also introduces a client runtime which can leverage certain components of the server-side runtime. Apache Wink will deliver component technology that can be easily integrated into a variety of environments. - Proposal - Apache Wink is a project that enables and simplifies development of REST style HTTP based services. The project includes both server and client side components that can be used independently of each other. The server side is a stand-alone component that integrates easily with many existing application servers. 
The client-side API enables the user to develop applications that interact with server resources in a RESTful manner. The goal is to provide component technology for both RESTful services and clients that can be used in a number of contexts. These contexts could range from a full Java EE runtime environment (Geronimo) to a J2SE environment with a simple HTTP listener service. The server component of Apache Wink will implement a TCK-compliant version of the JAX-RS standard defined by JSR 311 (https://jsr311.dev.java.net/). The client-side component provides a rich API for quickly developing applications that access and update server resources using JAX-RS requests. The API can accommodate data returned in several popular formats including JSON, XML, ATOM, HTML, and CSV. Plans for future extensions are currently being discussed, but include a focus on ease of use through service discovery and quality-of-service configuration (security, caching). - Background - Over the past decade, the Representational State Transfer (REST) architectural style of web services has been gaining popularity. Introduced by Roy Fielding in 2000, the idea of providing simple HTTP-based access to server resources has continued to grow even as other, more complex web service architectures have been published. The JSR 311 standard (https://jsr311.dev.java.net) defines a standard set of annotations and a programming model for exposing Java resources as REST-based resources. With the recent approval of the standard and its inclusion in Java EE 6, the use of REST and its Java programming standard (JAX-RS) will certainly grow in the near future. As such, there will be demand for an Apache-friendly, open-source implementation of the standard. Apache Wink seeks to provide this implementation in an independent manner that is not tied to any platform. 
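The annotation-driven model JSR 311 describes (annotated methods mapped to HTTP verbs and URL paths) can be illustrated with a toy dispatcher. The sketch below is in Python, with plain decorators standing in for JAX-RS annotations such as @Path and @GET; every name here is an illustrative assumption, not part of Wink's actual API.

```python
# Toy illustration (hypothetical names, not the Wink API): JSR 311 maps
# annotated resource methods to HTTP verbs and paths; here a decorator
# plays the role of the @Path/@GET annotations.
_routes = {}

def get(path):
    """Register a handler for GET requests on the given path."""
    def register(func):
        _routes[("GET", path)] = func
        return func
    return register

@get("/hello")
def hello():
    # A resource method: returns a (status, body) pair.
    return 200, "Hello, REST"

def dispatch(method, path):
    """Route a request to the matching handler, or return 404."""
    handler = _routes.get((method, path))
    if handler is None:
        return 404, ""
    return handler()
```

In a real JAX-RS runtime the container performs this lookup from the annotations at deployment time; the sketch only shows the verb-plus-path dispatch idea.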
- Rationale - The rationale for the project is to build an implementation of the JAX-RS specification in open source that can be certified by the applicable TCKs (JSR-311). The project would also provide integration with Geronimo and other open-source REST communities. Building a strong, vendor-neutral community is important to the project so that it will outlast any one person's or company's participation. Code released from the project will also provide a basis to prototype and build new extensions that could eventually be taken for standardization as an extension to the JSR 311 work (such as a client API). However, the server side is only half of the equation. Once the server provides access to a resource, there need to be clients to access and utilize the data. As such, we want to provide a well rounded package that also
Re: [PROPOSAL] Apache SocialSite
Another +1 from me, too. SocialSite needs to live on and Apache could be a good home for it. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jamey Wood jamey.w...@gmail.com To: general@incubator.apache.org Cc: Eduardo Pelegri-Llopart pele...@sun.com; rovagn...@gmail.com; Robert Bissett robert.biss...@sun.com; leandro.milma...@globant.com; rodr...@globant.com; Tony Ng tony...@sun.com Sent: Thursday, April 23, 2009 11:47:00 AM Subject: Re: [PROPOSAL] Apache SocialSite I'm very much +1 on this. I appreciate Sun's willingness to contribute the existing SocialSite code, and I hope that we'll have the opportunity to evolve it under the Incubator's established governance model and level playing field. --Jamey On Wed, Apr 8, 2009 at 7:41 PM, Dave wrote: Greetings to all, It's my pleasure to present to you a proposal for a new project Apache SocialSite, a social networking service based on Apache Shindig (incubating) with an end-user interface composed entirely of OpenSocial gadgets and designed to add social networking features to existing web applications (e.g. Roller, JSPWiki, your favorite webapp, etc.). You can find the full proposal on the Incubator wiki: http://wiki.apache.org/incubator/SocialSiteProposal I look forward to your comments and suggestions on this proposal. Thanks, Dave
Re: UIMA [WAS Re: Suspending Projects]
My GSoC suggestion was related to the earlier comment that for some reason UIMA is unable to engage more contributors and convert them into committers (GSoC could help get some fresh blood), not the branch tied to academia. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ross Gardler rgard...@apache.org To: general@incubator.apache.org Sent: Saturday, February 21, 2009 10:40:12 PM Subject: Re: UIMA [WAS Re: Suspending Projects] 2009/2/21 Otis Gospodnetic : Perhaps GSoC is something to consider. I see UIMA didn't have anything in 2008: http://wiki.apache.org/general/SummerOfCode2008 GSoC is entirely separate from the academics; it is for students. The problems expressed in this thread are with respect to research staff who produce software as part of their work. I agree the GSoC mentoring idea can be used to educate the staff within a university. I'm currently seeking funding for doing something with this, but my existing project focuses on the research staff being discussed here. Ross Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thilo Goetz To: general@incubator.apache.org Sent: Friday, February 20, 2009 4:31:50 PM Subject: Re: UIMA [WAS Re: Suspending Projects] Niclas Hedhman wrote: On Fri, Feb 20, 2009 at 6:58 AM, Robert Burrell Donkin wrote: We should probably try to find the collective energy to review UIMA before the project's enthusiasm is sapped. That sounds like a healthy observation. My Q for the community is: Do you have a healthy and diverse set of users? If so, have the UIMA team looked at what is stopping these users from becoming contributors? Yes, we do have a healthy and diverse user community. We have racked our brains over what we could do to attract more community contribution. We've created a sandbox to facilitate the inclusion of experimental technology. There's been some uptake, but not enough. 
Some of us are working on scale-out via JMS and are hoping to attract contributions in that area. We've started discussions and suggested things for people to work on. I don't know, maybe we're going about this the wrong way. My pet hypothesis (or maybe I'm just looking for excuses) is this: UIMA is heavily used in academia. Now academics have no problems with open source, to the contrary. But they have an overwhelming need to publish and build up a reputation. So they like to publish their source code on their own web site, where it's clear it's their work, rather than contribute to some community effort. If you look around, you'll see all manner of university efforts around UIMA, but very little of that code finds its way back into the ASF repo. Enough whining. If you have any suggestions, we'll be happy to hear them. --Thilo I could imagine a whole range of reasons, and if that is 'fixed' the diversity comes with it... If there is not a user community, then I would be concerned to graduate the project with the large set of single-employer committers. Cheers Niclas -- -- Ross Gardler OSS Watch - awareness and understanding of open source software development and use in education http://www.oss-watch.ac.uk
Re: UIMA [WAS Re: Suspending Projects]
Perhaps GSoC is something to consider. I see UIMA didn't have anything in 2008: http://wiki.apache.org/general/SummerOfCode2008 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thilo Goetz twgo...@gmx.de To: general@incubator.apache.org Sent: Friday, February 20, 2009 4:31:50 PM Subject: Re: UIMA [WAS Re: Suspending Projects] Niclas Hedhman wrote: On Fri, Feb 20, 2009 at 6:58 AM, Robert Burrell Donkin wrote: We should probably try to find the collective energy to review UIMA before the project's enthusiasm is sapped. That sounds like a healthy observation. My Q for the community is: Do you have a healthy and diverse set of users? If so, have the UIMA team looked at what is stopping these users from becoming contributors? Yes, we do have a healthy and diverse user community. We have racked our brains over what we could do to attract more community contribution. We've created a sandbox to facilitate the inclusion of experimental technology. There's been some uptake, but not enough. Some of us are working on scale-out via JMS and are hoping to attract contributions in that area. We've started discussions and suggested things for people to work on. I don't know, maybe we're going about this the wrong way. My pet hypothesis (or maybe I'm just looking for excuses) is this: UIMA is heavily used in academia. Now academics have no problems with open source, to the contrary. But they have an overwhelming need to publish and build up a reputation. So they like to publish their source code on their own web site, where it's clear it's their work, rather than contribute to some community effort. If you look around, you'll see all manner of university efforts around UIMA, but very little of that code finds its way back into the ASF repo. Enough whining. If you have any suggestions, we'll be happy to hear them. --Thilo I could imagine a whole range of reasons, and if that is 'fixed' the diversity comes with it... 
If there is not a user community, then I would be concerned to graduate the project with the large set of single-employer committers. Cheers Niclas
Re: UIMA [WAS Re: Suspending Projects]
+1. We have this attitude over at Lucene and I think it works well. We also have HowToContribute pages for both Lucene and Solr and regularly point people to them. We also encourage contribution via "Perhaps you can open a JIRA issue and contribute your patch" type suggestions on the ML. We give credit to all contributors and always thank them. Lucene has been around for about 10 years now. It's a very healthy and active project. It now has lots and lots of contributors, but it took some time to get there. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: William A. Rowe, Jr. wr...@rowe-clan.net To: general@incubator.apache.org Sent: Saturday, February 21, 2009 12:57:09 AM Subject: Re: UIMA [WAS Re: Suspending Projects] Robert Burrell Donkin wrote: I know that we usually try to strongly encourage companies not to use Apache as a dumping ground, but I wonder sometimes whether it might be useful to accept more contributions of proof-of-concept code, especially from academia. It's often easier to start from some proof-of-concept code than from scratch. +1 - let's please not forget that NCSA httpd was exactly that, a code dump (well, fork/import) of an abandoned work :)
Re: [VOTE] Accept Cassandra into the Incubator
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Ian Holsman li...@holsman.net To: general@incubator.apache.org Sent: Tuesday, December 23, 2008 5:01:37 PM Subject: [VOTE] Accept Cassandra into the Incubator Dear Incubator PMC, There has been some discussion around the Cassandra proposal, and we would now like to officially propose Cassandra to the Incubator for consideration. Please vote on accepting the Cassandra project for incubation. The full Cassandra proposal is available at the end of this message and as a wiki page at http://wiki.apache.org/incubator/Cassandra. We ask the Incubator PMC to sponsor the Cassandra podling, with Brian as the Champion, and Torsten, Matthieu, and Ian volunteering to mentor as well. The vote is open for the next 72 hours and only votes from the Incubator PMC are binding. [ ] +1 Accept Cassandra as a new podling [ ] -1 Do not accept the new podling (provide reason, please) = Abstract = Cassandra is a distributed storage system for managing structured/unstructured data while providing reliability at a massive scale. = Background = Development of Cassandra started at Facebook in June 2007. It started as a system to solve the Inbox Search problem and has since matured to solve various storage problems associated with structured/unstructured data. = Rationale = Cassandra is a distributed storage system for managing structured data that is designed to scale to a very large size across many commodity servers, with no single point of failure. The philosophy behind the design of the storage portion of Cassandra is that it be able to satisfy the requirements of applications that demand storage of large amounts of structured data. Reliability at massive scale is a very big challenge. Outages in the service can have significant negative impact. Hence Cassandra aims to run on top of an infrastructure of hundreds of nodes (possibly spread across different datacenters). 
At this scale, small and large components fail continuously; the way Cassandra manages persistent state in the face of these failures drives the reliability and scalability of the software systems relying on this service. = Initial Source = Initial source can be obtained from the following site - http://the-cassandra-project.googlecode.com/svn/branches/development/. The mailing list is currently maintained at the same site. We will move it over to Apache once this proposal has been accepted. = Source and Intellectual Property Submission Plan = = External Dependencies = * All dependencies have Apache-compatible licenses. Dependencies are log4j, Thrift, and Apache Commons. = Cryptography = * None = Committers = * Avinash Lakshman * Prashant Malik * Kannan Muthukkaruppan * Jiansheng Huang * Dan Dumitriu = Current Status = == Meritocracy == * Though initial development was done at Facebook, Cassandra was intended to be released as an open-source project from its inception. The environment will lend itself to supporting meritocracy at all times. == Community == * Folks who are actively considering deploying/prototyping Cassandra in their respective organizations. == Core Developers == * Avinash Lakshman * Prashant Malik * Kannan Muthukkaruppan == License == * The Cassandra codebase is Apache 2.0 licensed and currently hosted at Google Code. = Known Risks/Avoiding the Warning Signs = == Orphaned Products == * Cassandra is already deployed within Facebook, and many other organizations are actively moving to deploy it in production. The original developers are and will remain actively involved, so there is no realistic chance of it becoming orphaned. == Homogeneous Developers == * The current list of committers includes developers from different companies. The committers are geographically distributed across the U.S. == Reliance on Salaried Developers == * Yes, but we don't expect this to be a risk of any nature. 
== Relationships with Other Apache Products == * The Cassandra project is 'similar' to HBase/HDFS in concept, but Cassandra is geared more toward online web-site usage than batch processing. It also doesn't have a single point of failure, which makes it interesting as well. * Cassandra makes use of the Thrift project. == An excessive fascination with the Apache brand == * Cassandra has already attracted a stable base of users. There are at least 3 companies who are planning to use Cassandra in production as far as we know. The reasons for joining Apache are not to advertise the project, but rather to demonstrate the commitment to open source by divorcing the trunk from any one corporation and pursuing further integration with other Apache projects. = Required Resources = == Mailing lists == Once the project is approved, the following mailing lists will be used
Re: Cassandra Incubator Proposal
Hello, The "distributed storage system for managing structured/unstructured data while providing reliability at a massive scale" part sounds kind of like HDFS. Would it be possible to describe how Cassandra is different from HDFS? Perhaps the best place to do it is under the Relationships with Other Apache Products section. Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Avinash Lakshman [EMAIL PROTECTED] To: general@incubator.apache.org Cc: Prashant Malik [EMAIL PROTECTED]; Kannan Muthukkaruppan [EMAIL PROTECTED] Sent: Monday, December 1, 2008 7:51:28 PM Subject: Cassandra Incubator Proposal Hi Folks Please consider our proposal to move the Cassandra project into the Incubation process - http://wiki.apache.org/incubator/Cassandra. Please advise as to what else is required for us to complete this process. Cheers Avinash - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [Vote] accept Droids into incubation
+1 Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thorsten Scherler [EMAIL PROTECTED] To: Incubator general@incubator.apache.org Sent: Thursday, October 2, 2008 4:00:41 PM Subject: [Vote] accept Droids into incubation Please vote on accepting Droids into incubation. The proposal can be found at: http://wiki.apache.org/incubator/DroidsProposal The text of the proposal: = Droids, an intelligent standalone robot framework = === Abstract === Droids aims to be an intelligent standalone robot framework that allows users to create and extend existing droids (robots). === Proposal === As a standalone robot framework, Droids will offer infrastructure code to create and extend existing robots. In the future it will also offer a web-based administration application to manage and control the different droids, which will communicate with this app. Droids makes it very easy to extend existing robots or write a new one from scratch that can automatically seek out relevant online information based on the user's specifications. Thanks to its flexible design, it can directly reuse any custom business logic written in Java. In the long run it should become an umbrella for specialized droids hosted as sub-projects, where an ultimate goal is to integrate artificial intelligence that can control a swarm of droids and actively plan/react to different tasks. === Background === The initial idea for the Droids project was voiced in February 2007 by Thorsten Scherler, mainly out of personal curiosity, and developed as a labs project. The background of his work was that Cocoon trunk (2.2) no longer provided a crawler, and Forrest was based on it, meaning we could not update until we found a crawler replacement. Getting more involved in Solr and Nutch, he saw the demand for a generic standalone crawler. For the first version he took Nutch and ripped out and modified its plugin/extension framework.
However, the second version was no longer based on it but used Spring instead. The main reason was that Spring had become a standard and helped make Droids as extensible as possible. Soon the first plugins and sample droids were added to the codebase. === Rationale === There is ever more demand for tools that automatically perform certain tasks. Search engines such as Nutch are normally very focused on a specific functionality and not on extensibility. Furthermore, they are mainly focused on crawling: requesting certain pages and extracting links to other pages, which in our opinion is only one small area for automated robots. While there are a number of existing crawler libraries for various tasks, each of them comes with a custom API and there is no generic interface for automatically determining which crawler (droid) to use for a specific task. The Droids project attempts to remove this duplication of effort. We believe that by pooling the efforts of multiple projects we will be able to create a generic robot framework that exceeds the capabilities and quality of the custom solutions of any single project. The focus of Droids is not a single crawler but rather offering different reusable components that custom droids (robots) can use to automate certain tasks. An intelligent standalone robot framework project will provide common ground not only for the developers of crawlers but also for any other automated application (robot) libraries. === Initial Goals === The initial goals of the proposed project are: * Viable community around the Droids codebase * Active relationships and possible cooperation with related projects and communities (e.g.
reusing Tika for text extraction) * Generic robot API for crawling, extracting structured text content and/or new tasks, filtering tasks and handling the content * Flexible extension and plugin development to create a wide range of functionality * Fuel development of various droids and bring the current wget-style crawler to a state-of-the-art level == Current Status == === Meritocracy === All the initial committers are familiar with the meritocracy principles of Apache, and have already worked on the various source codebases. We will follow the normal meritocracy rules with other potential contributors as well. === Community === There is not yet a clear Droids community. Instead we have a number of people and related projects with an understanding that an intelligent standalone robot framework project would best serve everyone's interests. The primary goal of the incubating project is to build a self-sustaining community around this shared vision. === Core Developers === The initial set of developers comes from various backgrounds, with different but compatible needs for the proposed project. === Alignment === As a generic robot framework Droids will likely be
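The "reusable components" idea in the proposal can be sketched in a few lines. This is not the Droids API (it is Java, and the class and parameter names below are invented for illustration), but it shows the shape: a droid walks a task queue, a protocol component fetches content, a handler extracts new tasks, and a filter decides which tasks are followed.

```python
# Hedged sketch of a component-based "droid" (all names hypothetical,
# not the real Droids API): fetch / handle / filter are the pluggable
# components; the droid itself is just the generic task-queue loop.
from collections import deque

class SimpleDroid:
    def __init__(self, fetch, handle, accept):
        self.fetch = fetch      # task -> content (protocol component)
        self.handle = handle    # (task, content) -> new tasks (handler)
        self.accept = accept    # task -> bool (filter component)
        self.visited = set()

    def run(self, seed):
        queue = deque([seed])
        while queue:
            task = queue.popleft()
            if task in self.visited or not self.accept(task):
                continue
            self.visited.add(task)
            for new_task in self.handle(task, self.fetch(task)):
                queue.append(new_task)
        return self.visited

# Toy "site": pages linking to other pages.
site = {"a": ["b", "c"], "b": ["c", "x"], "c": [], "x": ["y"], "y": []}
droid = SimpleDroid(
    fetch=lambda t: site.get(t, []),
    handle=lambda t, links: links,
    accept=lambda t: not t.startswith("x"),  # filter: skip 'x*' tasks
)
print(sorted(droid.run("a")))  # → ['a', 'b', 'c']
```

Swapping any one lambda for a different implementation (a different protocol, extractor, or filter) changes the droid's behavior without touching the loop, which is the extensibility argument the proposal makes.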
Re: [PROPOSAL] Droids
This sounds good to me. Are you planning to run Droids on top of Hadoop? If not, why not? Thanks, Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Thorsten Scherler [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Monday, September 22, 2008 4:24:55 PM Subject: [PROPOSAL] Droids This is a proposal to enter the incubator. See http://wiki.apache.org/incubator/DroidsProposal for the most up-to-date version. As Champion we have Grant Ingersoll from the ASF. Droids is an Apache Labs project and we are still looking for some mentors for this proposal. We look forward to comments and discussion. = Droids, an intelligent standalone robot framework = === Abstract === Droids aims to be an intelligent standalone robot framework that allows users to create and extend existing droids (robots). === Proposal === As a standalone robot framework, Droids will offer infrastructure code to create and extend existing robots. In the future it will also offer a web-based administration application to manage and control the different droids, which will communicate with this app. Droids makes it very easy to extend existing robots or write a new one from scratch that can automatically seek out relevant online information based on the user's specifications. Thanks to its flexible design, it can directly reuse any custom business logic written in Java. In the long run it should become an umbrella for specialized droids hosted as sub-projects, where an ultimate goal is to integrate artificial intelligence that can control a swarm of droids and actively plan/react to different tasks. === Background === The initial idea for the Droids project was voiced in February 2007 by Thorsten Scherler, mainly out of personal curiosity, and developed as a labs project.
The background of his work was that Cocoon trunk (2.2) no longer provided a crawler, and Forrest was based on it, meaning we could not update until we found a crawler replacement. Getting more involved in Solr and Nutch, he saw the demand for a generic standalone crawler. For the first version he took Nutch and ripped out and modified its plugin/extension framework. However, the second version was no longer based on it but used Spring instead. The main reason was that Spring had become a standard and helped make Droids as extensible as possible. Soon the first plugins and sample droids were added to the codebase. === Rationale === There is ever more demand for tools that automatically perform certain tasks. Search engines such as Nutch are normally very focused on a specific functionality and not on extensibility. Furthermore, they are mainly focused on crawling: requesting certain pages and extracting links to other pages, which in our opinion is only one small area for automated robots. While there are a number of existing crawler libraries for various tasks, each of them comes with a custom API and there is no generic interface for automatically determining which crawler (droid) to use for a specific task. The Droids project attempts to remove this duplication of effort. We believe that by pooling the efforts of multiple projects we will be able to create a generic robot framework that exceeds the capabilities and quality of the custom solutions of any single project. The focus of Droids is not a single crawler but rather offering different reusable components that custom droids (robots) can use to automate certain tasks. An intelligent standalone robot framework project will provide common ground not only for the developers of crawlers but also for any other automated application (robot) libraries.
=== Initial Goals === The initial goals of the proposed project are: * Viable community around the Droids codebase * Active relationships and possible cooperation with related projects and communities (e.g. reusing Tika for text extraction) * Generic robot API for crawling, extracting structured text content and/or new tasks, filtering tasks and handling the content * Flexible extension and plugin development to create a wide range of functionality * Fuel development of various droids and bring the current wget-style crawler to a state-of-the-art level == Current Status == === Meritocracy === All the initial committers are familiar with the meritocracy principles of Apache, and have already worked on the various source codebases. We will follow the normal meritocracy rules with other potential contributors as well. === Community === There is not yet a clear Droids community. Instead we have a number of people and related projects with an understanding that an intelligent standalone robot framework project would best serve everyone's interests. The primary goal of the incubating project is to build a self-sustaining community around this shared
Re: [PROPOSAL] Etch
Grant is right. Others were making the point about Debian. I was making a point about it being an overly generic English word. The former may be a short-term problem, the latter a very long-term one. If changing the name is such a problem (work) then... Otis P.S. I, too, like the name Etch, it's just that it's an English word, plus there are several other products with Etch in their name. - Original Message From: Grant Ingersoll [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Friday, August 8, 2008 6:28:23 AM Subject: Re: [PROPOSAL] Etch On Aug 8, 2008, at 4:28 AM, James Dixson (jadixson) wrote: Simply put: a name change is work. Before I can accept the need to do work, I want to clearly understand the benefits of doing it. Etch, while new to open-source, does have some awareness in a technical community ( http://developer.cisco.com/web/cuae ). We have been publicly pitching and distributing Etch in our community for several months now. People have been using the technology, and for our current community Etch != Debian. Granted, a couple of months is a short amount of time, but it is something. Imposing a name change on our current community, with the reasoning that the future community would be unable to differentiate between Apache Etch and the etch release of Debian, would be disruptive. I don't think the argument is necessarily that the future community can't distinguish between Apache Etch and Debian; I think the argument is that the future community won't be able to find it, period, which means the future community may well be smaller than it would be with a more distinctive name. Put it this way: you search for Hadoop, and the top 10 on Google is all Apache Hadoop. You search for Etch and you will be lucky to crack the top 10, methinks, but who knows, maybe you'll get enough rank to displace the Etch-a-Sketch and it will be a non-issue. Of course, the work thing I understand, too, although it seems like a global search and replace wouldn't be that bad.
You also certainly could change it over time, even after being accepted into incubation, I think, just as long as it's done before first release. FWIW, I like the name Etch :-) -Grant
Re: [PROPOSAL] Etch
http://www.google.com/search?q=etch -- 12MM hits for an unreleased product (this Etch) http://www.google.com/search?q=hadoop -- 600K hits for a product that's been out for over a year Why such resistance to name change? There is this proverb that might be suitable here. The loose translation is: When one person tells you you are a donkey, ignore him. When two people tell you you are a donkey, go buy yourself a saddle. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Scott Comer (sccomer) [EMAIL PROTECTED] To: general@incubator.apache.org; general@incubator.apache.org Sent: Thursday, August 7, 2008 2:50:37 PM Subject: Re: [PROPOSAL] Etch Doug is a wise man and that is how we picked the name etch 18 month ago. Scott out -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent:Wednesday, August 06, 2008 12:12 PM Pacific Standard Time To:general@incubator.apache.org Subject:Re: [PROPOSAL] Etch Doug Cutting has a few nice and short naming rules that I liked when I read them. I believe one of them was that a Google search for the proposed name should yield very few matches. Hadoop and Lucene are/were good examples of that. Here is another naming example. I created this bookmarking service called simpy - simpy.com . Great name - short, memorable, easy to spell, etc. people said. The problem with it is that simpy is a common misspelling of simply. So, another thing to keep in mind. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: James Dixson (jadixson) To: general@incubator.apache.org Sent: Tuesday, August 5, 2008 6:03:28 PM Subject: RE: [PROPOSAL] Etch I have heard the name concern a couple of times now... When we picked the name Etch about 18 months ago, we knew about the Debian release, but frankly we were unconcerned. Debian etch is the name of a release of Debian, no different than 4 being a release of Fedora. 
Eventually Debian etch will fade into memory just as sarge and woody have. Etch, in our line of thinking (if it could be said we were doing any thinking :-) ) was the name we were giving to the technology, not a release. There were no other technologies named Etch or similar, so we declared victory and moved on. So I guess my question back to everyone is this: What is the concern about the name Etch, really? 1. Is there a legal trademark issue or a formal Apache branding policy issue with using the name Etch, such that use of the name is simply not going to be allowed? - or - 2. Is there a concern that during incubation, we might have to be explicit in communication and always say Apache Etch rather than just Etch because of a fear that a reference to Etch, taken out of context, could be confused with Debian 4.0? If 1 is true then, absolutely, the name should be changed. But if only 2 is true, then I will need a bit more convincing. I am after all very, very lazy and changing the name of a working toolset is well... work :-) --- James Dixson Manager, Software Development CUAE Engineering, Cisco Systems Inc. Direct: 512-336-3305 Mobile: 512-968-2116 [EMAIL PROTECTED] -Original Message- From: Niklas Gustavsson [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 05, 2008 3:33 PM To: general@incubator.apache.org Subject: Re: [PROPOSAL] Etch On Thu, Jul 31, 2008 at 6:16 PM, James Dixson wrote: This is a proposal to enter Etch into the incubator. See http://wiki.apache.org/incubator/EtchProposal for updates. +1 for incubation (non-binding). While I find this area to be a bit overcrowded lately, having both Etch and Thrift at Apache and Protocol Buffers under ASL 2.0 does offer some interesting opportunities for competition as well as cooperation. I do share the concerns about naming conflicts. Debian is by far more well known, and trying to establish this project under a conflicting name would be hard.
/niklas
Re: [PROPOSAL] Etch
Doug Cutting has a few nice and short naming rules that I liked when I read them. I believe one of them was that a Google search for the proposed name should yield very few matches. Hadoop and Lucene are/were good examples of that. Here is another naming example. I created this bookmarking service called simpy - simpy.com . Great name - short, memorable, easy to spell, etc. people said. The problem with it is that simpy is a common misspelling of simply. So, another thing to keep in mind. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: James Dixson (jadixson) [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, August 5, 2008 6:03:28 PM Subject: RE: [PROPOSAL] Etch I have heard the name concern a couple of times now... When we picked the name Etch about 18 months ago, we knew about the Debian release, but frankly we were unconcerned. Debian etch is the name of a release of Debian, no different than 4 being a release of Fedora. Eventually Debian etch will fade into memory just as sarge and woody have. Etch, in our line of thinking (if it could be said we were doing any thinking :-) ) was the name we were giving to the technology, not a release. There were no other technologies named Etch or similar, so we declared victory and moved on. So I guess my question back to everyone is this: What is the concern about the name Etch, really? 1. Is there a legal trademark issue or a formal Apache branding policy issue with using the name Etch such that the use of the name is simply not going to be allowed. - or - 2. Is there a concern that during incubation, we might have to be explicit in communication and always say Apache Etch rather than just Etch because of a fear that a reference to Etch, taken out of context, could be confused with Debian 4.0? If 1 is true then, absolutely the name should be changed. But if only 2 is true, then I will need a bit more convincing. 
I am after all very, very lazy and changing the name of a working toolset is well... work :-) --- James Dixson Manager, Software Development CUAE Engineering, Cisco Systems Inc. Direct: 512-336-3305 Mobile: 512-968-2116 [EMAIL PROTECTED] -Original Message- From: Niklas Gustavsson [mailto:[EMAIL PROTECTED] Sent: Tuesday, August 05, 2008 3:33 PM To: general@incubator.apache.org Subject: Re: [PROPOSAL] Etch On Thu, Jul 31, 2008 at 6:16 PM, James Dixson wrote: This is a proposal to enter Etch into the incubator. See http://wiki.apache.org/incubator/EtchProposal for updates. +1 for incubation (non-binding). While I find this area to be a bit overcrowded lately, having both Etch and Thrift at Apache and Protocol Buffers under ASL 2.0 does offer some interesting opportunities for competition as well as cooperation. I do share the concerns about naming conflicts. Debian is by far more well known, and trying to establish this project under a conflicting name would be hard. /niklas
Re: [DISCUSSION] Hama Proposal
Edward, I was going to email you about this weeks ago, when I first saw this proposal. You are working in a vacuum too much, I think that's the main problem. You are mentioning private e-mails, and that doesn't sound right. Bring this up in the open on [EMAIL PROTECTED] and state what you'd like. I, like Yonik and Grant, feel that this fits very well under Hadoop, either as a sub-project or simply a contrib. I *believe* that if you go the sub-project route, and especially if you simply make Hama a Hadoop contrib, no incubation is necessary, as long as the Hadoop PMC welcomes the code. Much simpler. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: edward yoon [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, March 18, 2008 7:02:23 PM Subject: Re: [DISCUSSION] Hama Proposal Do you have a mail thread reference? This seems small enough in scope and so tied to Hadoop that it seems like it should either just be part of one of the Hadoop sub-projects or, at a maximum, a Hadoop sub-project of its own. http://www.mail-archive.com/[EMAIL PROTECTED]/msg00136.html But I mostly talked about it via private e-mail. They gave me a welcome; however, they also all agreed on the need to make incubation progress. Thanks, Edward. On 3/19/08, Yonik Seeley [EMAIL PROTECTED] wrote: This seems small enough in scope and so tied to Hadoop that it seems like it should either just be part of one of the Hadoop sub-projects or, at a maximum, a Hadoop sub-project of its own. I see you opened https://issues.apache.org/jira/browse/HADOOP-2878 what's the status of that? -Yonik On Tue, Mar 18, 2008 at 8:02 AM, edward yoon [EMAIL PROTECTED] wrote: Dear Incubator PMC, I've updated the Hama project proposal. Please review/update as needed and report back any concerns.
http://wiki.apache.org/incubator/HamaProposal Hama has a strong relationship with the Hadoop, HBase and Mahout projects, so I discussed becoming a sub-project of those projects at length with each community. However, the sub-project route was beset with difficulties, hence the list of committers, etc. And now we all agree that Hama should aim to be general purpose rather than become a specialized piece of something else. http://www.nabble.com/-jira--Created%3A-%28MAHOUT-16%29-Hama-contrib-package-for-the-mahout-to15998717.html https://issues.apache.org/jira/browse/HADOOP-2878 If you think this will make a good ASF project, please encourage our team members to create the world's largest matrix computational framework. Thanks. B. Regards, Edward yoon @ NHN, corp.
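To make the "matrix computational framework on Hadoop" idea concrete, here is a hedged sketch (not Hama's API; function and variable names are invented) of how one matrix operation maps onto the map/reduce style the proposal builds on: each sparse-matrix entry contributes a partial product keyed by row, and the reduce step sums per row.

```python
# Illustrative only (NOT Hama's API): matrix-vector multiplication
# expressed in map/reduce style. Map emits (row, a_ij * x_j) pairs;
# reduce sums the partial products per row key.
from collections import defaultdict

def mapreduce_matvec(entries, x):
    """entries: sparse matrix as (i, j, a_ij) triples; x: dense vector."""
    # Map step: each entry yields a partial product keyed by its row.
    mapped = ((i, a_ij * x[j]) for i, j, a_ij in entries)
    # Reduce step: sum partial products that share a row key.
    result = defaultdict(float)
    for i, partial in mapped:
        result[i] += partial
    return dict(result)

# [[1, 2], [3, 4]] times [5, 6] = [17, 39]
A = [(0, 0, 1), (0, 1, 2), (1, 0, 3), (1, 1, 4)]
print(mapreduce_matvec(A, [5, 6]))  # → {0: 17.0, 1: 39.0}
```

Because the map and reduce steps are independent per entry and per row, the same decomposition parallelizes across a cluster, which is why a matrix framework sits naturally on top of Hadoop.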
Re: hit counters for incubating web sites
Ah, good question (don't have the answer). Personally, I'd love to see Apache.org stats via something like Google Analytics, but perhaps this can be a per-project thing. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Marshall Schor [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, January 15, 2008 11:27:01 AM Subject: hit counters for incubating web sites Various posts in the past have expressed interest in collecting statistics on usage, or downloads. Previous replies have pointed out that counting downloads is inaccurate, because Apache licensed components can be redistributed by others, and the Apache mirroring system means that most downloads occur from non-Apache machines. We would like to get some statistical information about downloads, and are thinking that counting clicks on the download button(s) would be a good way (it would avoid the problem of missing mirroring). Although not perfect (it would miss repackaging/redistribution, and other sites which link to a download page other than our own), we think it would be somewhat useful, at least as a lower bound of interest. Vadim Gritsenko has a stats site on people.a.o, looking at downloads by extracting data from web server logs. He has said, however, that he won't track individual incubator projects, just TLP. See http://people.apache.org/~vgritsenko/stats/index.html and http://people.apache.org/~vgritsenko/faq.html . Most of the hit counters out there seem to be snippets of html you add to your web page, which go off to someone else's server, where the counting happens. Is there a service running on an apache server (e.g., people.a.o), which we can use for hit-counting? If so, can someone post the html needed to use it? -Marshall
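The click-counting idea Marshall describes can be sketched as a tiny redirect endpoint: the download link points at the counter, which records the hit and then forwards the browser to the real download page. Everything below is hypothetical (the WSGI app, the artifact path, the target URL) — a minimal illustration of the approach, not an Apache infrastructure service:

```python
# Hedged sketch of the click-counting idea: route the download link
# through a counting redirect. All names/paths here are hypothetical.
from collections import Counter

MIRROR = "http://www.apache.org/dyn/closer.cgi"  # example redirect target
hits = Counter()

def count_and_redirect(environ, start_response):
    """Minimal WSGI app: count the hit, then 302 to the download page."""
    artifact = environ.get("PATH_INFO", "/").lstrip("/") or "unknown"
    hits[artifact] += 1
    start_response("302 Found", [("Location", f"{MIRROR}/{artifact}")])
    return [b""]

# Simulate two clicks on the same artifact without starting a server.
def fake_request(path):
    captured = {}
    def start_response(status, headers):
        captured["status"], captured["headers"] = status, dict(headers)
    count_and_redirect({"PATH_INFO": path}, start_response)
    return captured

fake_request("/uima/uima-1.0.zip")
resp = fake_request("/uima/uima-1.0.zip")
print(hits["uima/uima-1.0.zip"], resp["status"])  # → 2 302 Found
```

As the thread notes, this counts clicks rather than completed downloads, and misses redistribution, so it is only a lower bound of interest — but it survives mirroring, which log scraping does not.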
Re: [DISCUSS] PDFBox proposal
Sounds like a good addition to ASF! Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Jukka Zitting [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Wednesday, November 14, 2007 8:08:33 PM Subject: [DISCUSS] PDFBox proposal Hi, Ben Litchfield, the author of the PDFBox library, has been working with us at the ApacheCon preparing a proposal to bring PDFBox into the Apache Incubator. See http://wiki.apache.org/incubator/PDFBoxProposal for the current draft of the proposal. Some of the details are yet to be worked out, but the general idea is there. All comments and questions are welcome! BR, Jukka Zitting
Re: [PROPOSAL] Shindig, an OpenSocial Container
+1 -- my simpy.com might become a client soon. :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message From: Brian McCallister [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Friday, November 9, 2007 1:03:49 PM Subject: [PROPOSAL] Shindig, an OpenSocial Container Shindig Proposal -- = Abstract = Shindig will develop the container and backend server components for hosting OpenSocial applications. = Proposal = Shindig will develop a JavaScript container and implementations of the backend APIs and proxy required for hosting OpenSocial applications. = Background = OpenSocial provides a common set of APIs for social applications across multiple websites. With standard JavaScript and HTML, developers can create social applications that use a social network's friends and update feeds. A social application, in this context, is an application run by a third party provider and embedded in a web page, or web application, which consumes services provided by the container and by the application host. This is very similar to Portal/Portlet technology, but is based on client-side compositing, rather than server. More information can be found about OpenSocial at http://code.google.com/apis/opensocial/ == Rationale == Shindig is an implementation of an emerging set of APIs for client-side composited web applications. The Apache Software Foundation has proven to have developed a strong system and set of mores for building community-centric, open standards based systems with a wide variety of participants. A robust, community-developed implementation of these APIs will encourage compatibility between service providers, ensure an excellent implementation is available to everyone, and enable faster and easier application development for users. The Apache Software Foundation has proven it is the best place for this type of open development. = Current Status = This is a new project. 
= Meritocracy = The initial developers are very familiar with meritocratic open source development, both at Apache and elsewhere. Apache was chosen specifically because the initial developers want to encourage this style of development for the project. === Community === Shindig seeks to develop developer and user communities during incubation. = Core Developers = The initial core developers are all Ning employees. We hope to expand this very quickly. = Alignment = The developers of Shindig want to work with the Apache Software Foundation specifically because Apache has proven to provide a strong foundation and set of practices for developing standards-based infrastructure and server components. = Known Risks = == Orphaned products == Shindig is new development of an emerging set of APIs. == Inexperience with Open Source == The initial developers include long-time open source developers, including Apache Members. == Homogenous Developers == The initial group of developers is quite homogenous. Remedying this is a large part of why we want to bring the project to Apache. == Reliance on Salaried Developers == The initial group of developers is employed by a potential consumer of the project. Remedying this is a large part of why we want to bring the project to Apache. == Relationships with Other Apache Products == None in particular, except that Apache HTTPD is the best place to run PHP, which the server-side components Ning intends to donate have been implemented in. == An Excessive Fascination with the Apache Brand == We believe in the processes, systems, and framework Apache has put in place. The brand is nice, but is not why we wish to come to Apache. = Documentation = Google's OpenSocial Documentation: http://code.google.com/apis/opensocial/ Ning's OpenSocial Documentation: http://tinyurl.com/3y5ckx = Initial Source = Ning, Inc. intends to donate code based on their implementation of OpenSocial.
The backend systems will be replaced with more generic equivalents in order to not bind the implementation to specifics of the Ning platform. This code will be extracted from Ning's internal development, and has not been expanded on past the extraction. It will be provided primarily as a starting place for a much more robust, community- developed implementation. = External Dependencies = The initial codebase relies on a library created by Google, Inc., and licensed under the Apache Software License, Version 2.0. = Required Resources = Developer and user mailing lists A subversion repository A JIRA issue tracker = Initial Committers = Thomas Baker[EMAIL PROTECTED] Tim Williamson [EMAIL PROTECTED] Brian McCallister [EMAIL PROTECTED] Thomas Dudziak [EMAIL PROTECTED] Martin Traverso [EMAIL PROTECTED] = Sponsors = == Champion == Brian McCallister [EMAIL PROTECTED] == Nominated Mentors == Brian McCallister [EMAIL PROTECTED] Thomas Dudziak [EMAIL PROTECTED]
Re: Incubator Proposal: Pig
Big +1! :) Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Olga Natkovich [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, September 18, 2007 3:52:23 PM Subject: Incubator Proposal: Pig Hi, Yahoo! research and development teams have developed a proposal below. The proposal is also available on wiki at http://wiki.apache.org/incubator/PigProposal. We would like to ask that the ASF consider forming a podling according to the proposal. Thanks, Olga Natkovich mailto:[EMAIL PROTECTED] [EMAIL PROTECTED] - = Pig Open Source Proposal = == Abstract == Pig is a platform for analyzing large data sets. == Proposal == The Pig project consists of high-level languages for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets. At the present time, Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties: 1. ''Ease of programming''. It is trivial to achieve parallel execution of simple, embarrassingly parallel data analysis tasks. Complex tasks comprised of multiple interrelated data transformations are explicitly encoded as data flow sequences, making them easy to write, understand, and maintain. 2. ''Optimization opportunities''. The way in which tasks are encoded permits the system to optimize their execution automatically, allowing the user to focus on semantics rather than efficiency. 3. ''Extensibility''.
Users can create their own functions to do special-purpose processing. == Background == Pig started as a research project at Yahoo! in May of 2006 to combine ideas in parallel databases and distributed computing. The first internal release took place in July 2006. The first release was a simple front-end to the Hadoop Map/Reduce framework. The following releases added new features and evolved the language based on user feedback. In July 2007, Pig was taken over by a development team and the first production version is due to be released on 9/28/07. Since its inception, we have observed a steady growth of the user community within Yahoo!. In April 2007, Pig was released under a BSD-type license. Several external parties are using this version and have expressed interest in collaborating on its development. == Rationale == In an information-centric world, innovation is driven by ad-hoc analysis of large data sets. For example, search engine companies routinely deploy and refine services based on analyzing the recorded behavior of users, publishers, and advertisers. The rate of innovation depends on the efficiency with which data can be analyzed. To analyze large data sets efficiently, one needs parallelism. The cheapest and most scalable form of parallelism is cluster computing. Unfortunately, programming for a cluster computing environment is difficult and time-consuming. Pig makes it easy to harness the power of cluster computing for ad-hoc data analysis. While other languages exist that try to achieve the same goals, we believe that Pig provides more flexibility and gives more control to the end user. SQL typically requires (1) importing data from a user's preferred format into a database system's internal format, (2) well-structured, normalized data with a declared schema, and (3) programs expressed in declarative SELECT-FROM-WHERE blocks. In contrast, Pig Latin facilitates (1) interoperability, i.e.
data may be read/written in a format accepted by other applications such as text editors or graph generators, (2) flexibility, i.e. data may be loosely structured or have structure that is defined operationally, and (3) adoption by programmers who find procedural programming more natural than declarative programming. Sawzall is a scripting language used at Google on top of Map-Reduce. A Sawzall program has a fairly rigid structure consisting of a filtering phase (the map step) followed by an aggregation phase (the reduce step). Furthermore, only the filtering phase can be written by the user, and only a pre-built set of aggregations is available (new ones are non-trivial to add). While Pig Latin has similar higher-level primitives like filtering and aggregation, an arbitrary number of them can be flexibly chained together in a Pig Latin program, and all primitives can use user-defined functions with equal ease. Further, Pig Latin has additional primitives such as cogrouping, that allow
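The cogrouping primitive contrasted with SQL and Sawzall above can be illustrated with a small sketch. This is plain Python, not Pig Latin syntax or Pig's implementation, and the function and dataset names are invented for illustration; it only shows the shape of a cogroup result, where records from two inputs are collected per key into separate bags:

```python
from collections import defaultdict

def cogroup(left, right, key_left, key_right):
    """Group records from two datasets by a shared key, keeping each
    side's records in separate lists -- the shape of a cogroup result."""
    groups = defaultdict(lambda: ([], []))
    for rec in left:
        groups[key_left(rec)][0].append(rec)
    for rec in right:
        groups[key_right(rec)][1].append(rec)
    return dict(groups)

# Toy datasets: page visits (user, site) and site owners (site, owner).
visits = [("alice", "a.com"), ("bob", "b.com"), ("alice", "b.com")]
owners = [("b.com", "Bob Inc"), ("c.com", "Carol LLC")]

grouped = cogroup(visits, owners,
                  key_left=lambda v: v[1], key_right=lambda o: o[0])
# grouped["b.com"] pairs all visits to b.com with b.com's owner record,
# without collapsing either side the way an SQL join or aggregate would.
```

Unlike a join, each key retains both input bags intact, so later steps in the dataflow can aggregate, filter, or flatten them independently.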
Re: Board Reports - Missing and Reviews
It's too late now, but it looks like the Lucene.Net report is missing, too, no? George (from Lucene.net), what's new with Lucene.net? It looks like something happened in August, major activity - http://mail-archives.apache.org/mod_mbox/incubator-lucene-net-dev/ . Otis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Simpy -- http://www.simpy.com/ - Tag - Search - Share - Original Message From: Noel J. Bergman [EMAIL PROTECTED] To: general@incubator.apache.org Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED] Sent: Saturday, August 25, 2007 6:26:32 PM Subject: Board Reports - Missing and Reviews Due to the Board's schedule, there's been some extra time this month, but time's up. The Report (http://wiki.apache.org/incubator/August2007) needs to be completed, and PMC Members should review. Missing: lokahi, stdcxx, woden, wsrp4j. --- Noel - To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
Re: [VOTE] Roller graduation
Finally! +1 Otis - Original Message From: Dave [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, February 6, 2007 10:07:52 AM Subject: [VOTE] Roller graduation OK, let's try this again. The Roller community believes that Roller is ready for graduation, as evidenced by this vote: http://mail-archives.apache.org/mod_mbox/incubator-roller-dev/200702.mbox/browser We would like to initiate a vote to graduate to a top level project. We would like the resolution attached to this email to be presented to the board for consideration at the next possible board meeting. For additional information, the Roller status file is here: http://incubator.apache.org/projects/roller.html Thanks for your consideration. Please commence voting... - Dave Here is the resolution: https://svn.apache.org/repos/asf/incubator/roller/trunk/tlp-resolution.txt Establish the Apache Roller project WHEREAS, the Board of Directors deems it to be in the best interests of the Foundation and consistent with the Foundation's purpose to establish a Project Management Committee charged with the creation and maintenance of open-source software related to the Roller blog server, for distribution at no charge to the public. 
NOW, THEREFORE, BE IT RESOLVED, that a Project Management Committee (PMC), to be known as the Apache Roller Project, be and hereby is established pursuant to Bylaws of the Foundation; and be it further RESOLVED, that the Apache Roller Project be and hereby is responsible for the creation and maintenance of open-source software related to the Roller blog server; and be it further RESOLVED, that the office of Vice President, Roller be and hereby is created, the person holding such office to serve at the direction of the Board of Directors as the chair of the Apache Roller Project, and to have primary responsibility for management of the projects within the scope of responsibility of the Apache Roller Project; and be it further RESOLVED, that the persons listed immediately below be and hereby are appointed to serve as the initial members of the Apache Roller Project: * Anil Gangolli [EMAIL PROTECTED] * Allen Gilliland [EMAIL PROTECTED] * Dave Johnson[EMAIL PROTECTED] * Matt Raible [EMAIL PROTECTED] * Craig Russell [EMAIL PROTECTED] * Matthew Schmidt [EMAIL PROTECTED] * Elias Torres[EMAIL PROTECTED] * Henri Yandell [EMAIL PROTECTED] NOW, THEREFORE, BE IT FURTHER RESOLVED, that Dave Johnson be appointed to the office of Vice President, Roller, to serve in accordance with and subject to the direction of the Board of Directors and the Bylaws of the Foundation until death, resignation, retirement, removal or disqualification, or until a successor is appointed; and be it further RESOLVED, that the initial Apache Roller Project be and hereby is tasked with the creation of a set of bylaws intended to encourage open development and increased participation in the Roller Project; and be it further RESOLVED, that the initial Apache Roller Project be and hereby is tasked with the migration and rationalization of the Apache Incubator Roller podling; and be it further RESOLVED, that all responsibility pertaining to the Apache Incubator Roller podling encumbered upon the Apache 
Incubator PMC are hereafter discharged.
Re: [VOTE] graduate Solr to Lucene
+1 Otis - Original Message From: Yonik Seeley [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Friday, January 12, 2007 11:19:31 AM Subject: [VOTE] graduate Solr to Lucene The Solr community has voted and believes Solr is ready for graduation from the Incubator and has met all incubation requirements, and the Lucene PMC has voted to accept Solr. The Solr podling is therefore requesting to graduate from the Incubator to become an Apache Lucene subproject. Please send in your +1/0/-1 to approve/abstain/disapprove. References: Lucene PMC vote on its private list: Message-ID: [EMAIL PROTECTED] Solr community vote: http://mail-archives.apache.org/mod_mbox/lucene-solr-dev/200701.mbox/[EMAIL PROTECTED] The project status for Solr is at: http://incubator.apache.org/projects/solr.html The Solr home page is at: http://incubator.apache.org/solr/ -Yonik
Re: New Name for UIMA Podling?
Hi, - Original Message From: Rodent of Unusual Size [EMAIL PROTECTED] Mads Toftum wrote: +1 - there seems to have started some sort of fascination with changing names where there is no need. In general I'm not really a fan of naming things so that it is impossible to guess what a project is (that's hard enough as it is already). How would UIMA be pronounced in languages other than English? OG: How is New York pronounced in languages other than English? I think pronouncing UIMA would follow the same pattern. My question is: how do you pronounce UIMA in English? I actually don't pronounce it the English way, I pronounce it the Croatian/phonetic way - U (u as in [oo]ze) I (i as in [i]diot) M (m as in [m]ama) A (a as in [a]nother). Otis Aside from that, I'm wary of tedious and uninspiring names, like 'log4j'. I'm also wary of retaining names for projects that have had an existence prior to Apache. One reason is possible IP issues, and another is confusion. If some commercial concern has a product based on UIMA and they say so.. do they mean Apache UIMA? Pre-Apache UIMA? If they adopt the Apache package, do we need to worry about brand issues? (Answer: yes.) This is a new set of IP attributes for this item. I seriously think it needs a new name.. and not least because it's coming from the company with probably the greatest investment in software IP on the planet. With all the (baseless) remarks about Apache becoming a BigCo shill and clearinghouse, I see contraindications for maintaining the BigCo name. Just MHO. - -- #kenP-)} Ken Coar, Sanagendamgagwedweinini http://Ken.Coar.Org/ Author, developer, opinionist http://Apache-Server.Com/ Millennium hand and shrimp!
Re: New Name for UIMA Podling?
+1 for UIMA, even if some other ones are cute. Keeping UIMA makes it easy (for me) to pull just relevant pages from Google, Technorati, Simpy, etc. instead of also pulling up pages about cute animals, islands, and so on. Otis - Original Message From: Thilo Goetz [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Wednesday, October 18, 2006 7:27:40 AM Subject: Re: New Name for UIMA Podling? If there's no reason for us to change the project name, then I for one would just like to keep the one we have. We have built some name recognition around UIMA already, and I hope the Ukrainian Institute of Modern Art will forgive us for usurping the #1 spot on Google ;-) --Thilo Mads Toftum wrote: On Mon, Oct 16, 2006 at 11:58:53PM +0200, Leo Simons wrote: Note UIMA is a fine name for an apache project. We have projects like +1 - there seems to have started some sort of fascination with changing names where there is no need. In general I'm not really a fan of naming things so that it is impossible to guess what a project is (that's hard enough as it is already). vh Mads Toftum
Re: [VOTE] Mark lucene4c as dormant
[X] +1 - Mark Lucene4c as dormant. [ ] 0 - I have no opinion. [ ] -1 - No, please keep it! [include reason] Otis
Re: [Vote] accept UIMA as a podling - #2
[X] +1 Accept UIMA as an Incubator podling [ ] 0 Don't care [ ] -1 Reject this proposal for the following reason: Otis - Original Message From: Ian Holsman [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, September 26, 2006 7:17:37 PM Subject: [Vote] accept UIMA as a podling - #2 issues addressed in this release: 1. updated proposal included 2. The first paragraph explains it to a layperson 3. OASIS issue addressed [ ] +1 Accept UIMA as an Incubator podling [ ] 0 Don't care [ ] -1 Reject this proposal for the following reason: 8---Proposal--8-- Hello everyone - We are submitting this proposal to the community for a new project in the incubator, and look forward to starting to work with this community. This is a slightly modified and extended version of the proposal that has already been posted to [EMAIL PROTECTED] The whole mail thread can be found [http://www.nabble.com/Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-tf2154324.html here]. If you don't feel like reading the whole thread, the main question that came up was: this is all very well, but what does it really '''do'''? Attempts to answer that question were made [http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5986403.html here] and [http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5987788.html here]. We have since worked some of these into the proposal itself. = Proposal for Incubation Project: Unstructured Information Management Architecture - UIMA = == Abstract == UIMA is a component framework for the analysis of unstructured content such as text, audio and video. It comprises an SDK and tooling for composing and running analytic components written in Java and C++.
== Proposal: Unstructured Information Management Architecture framework == Unstructured Information Management applications are software systems that analyze large volumes of unstructured information in order to discover knowledge that is relevant to an end user. We propose UIMA, a framework and SDK for developing such applications. An example UIM application might ingest plain text and identify entities, such as persons, places, organizations; or relations, such as works-for or located-at. UIMA enables such an application to be decomposed into components, for example ''language identification'' - ''language specific segmentation'' - ''sentence boundary detection'' - ''entity detection (person/place names etc.)''. Each component must implement interfaces defined by the framework and must provide self-describing metadata via XML descriptor files. The framework manages these components and the data flow between them. Components are written in Java or C++; the data that flows between components is designed for efficient mapping between these languages. UIMA additionally provides capabilities to wrap components as network services, and can scale to very large volumes by replicating processing pipelines over a cluster of networked nodes. This framework has already attracted a following among government, commercial, and academic institutions who previously developed analysis algorithms, but were unable to easily build on each other's works, and who want to be able to evolve their applications by independently upgrading parts, as better technology becomes available. Applications built with this framework are being used with plain text, audio streams, and image/video streams, identifying entities and relations, converting speech to text, translating into different languages, and determining properties of images. The UIMA framework runs components in a flow, passing a common data object containing unstructured information (free text, audio, video, etc.) 
through the components. Each component examines the unstructured information and data added by other components, and adds data of its own. The framework mandates a standardized form of the data being passed, and a standardized form of the interfaces to the components. We propose a project to develop, implement, support and enhance this framework (and, over time, other implementations) that comply with the UIMA standard (which has been submitted for standardization work within [http://www.oasis-open.org OASIS]). Members of this community are encouraged to participate in that effort, as well; OASIS has an open approach to granting Technical Committee voting rights to members of OASIS, described here: http://www.oasis-open.org/committees/process.php#2.4. The proposal includes both the framework and tools to develop, describe, compose and deploy UIMA-based components and applications. The initial work will be based on the UIMA Version 2
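The component flow described in the proposal can be sketched in miniature. The class and method names below are hypothetical stand-ins, not UIMA's actual API; the sketch only shows the pattern of a shared analysis object passing through a pipeline, with each component reading what earlier components produced and adding annotations of its own:

```python
class Cas:
    """Common analysis structure: raw text plus accumulated annotations."""
    def __init__(self, text):
        self.text = text
        self.annotations = []   # (type, start, end) spans added by components

class SentenceDetector:
    """Marks naive sentence boundaries at periods."""
    def process(self, cas):
        start = 0
        for i, ch in enumerate(cas.text):
            if ch == ".":
                cas.annotations.append(("sentence", start, i + 1))
                start = i + 2

class CapitalizedWordDetector:
    """Stand-in for entity detection: flags capitalized words."""
    def process(self, cas):
        pos = 0
        for raw in cas.text.split():
            start = cas.text.index(raw, pos)
            word = raw.strip(".")
            if word and word[0].isupper():
                cas.annotations.append(("entity", start, start + len(word)))
            pos = start + len(raw)

def run_pipeline(components, text):
    cas = Cas(text)
    for component in components:   # the framework manages the flow
        component.process(cas)
    return cas

cas = run_pipeline([SentenceDetector(), CapitalizedWordDetector()],
                   "Alice works in Paris. Bob stayed home.")
```

In real UIMA the analogous data object is typed, self-describing, and designed for efficient exchange between Java and C++ components; this sketch only conveys the flow.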
Re: [VOTE] accept UIMA as a podling
Excellent, now that this is out of the way, I'm looking forward to an improved proposal, so we can vote on it. Perhaps, if Garrett doesn't mind, you may want to run the improved proposal by him first, before sending a new [VOTE] email with the inlined proposal to the list. Otis - Original Message From: Garrett Rooney [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tuesday, September 19, 2006 7:09:34 PM Subject: Re: [VOTE] accept UIMA as a podling On 9/19/06, Ian Holsman [EMAIL PROTECTED] wrote: Personally I look at some of the enterprise java proposals and have no clue about them either as I don't track the SOA/WS specs that closely. Yes, and that's a BAD thing. If this proposal was for some j2ee/WS/SOA related monstrosity with 98 different acronyms in the first paragraph it would be getting exactly the same -1 from me. -garrett
Re: [VOTE] accept UIMA as a podling
Damn, and I was going to give it +1. The UIMA folks answered questions about what it is that UIMA really does in emails, but yes, it should be answered in the proposal itself as well (I can't connect to wiki.apache.org at the moment to see the final proposal for myself). Otis - Original Message From: Garrett Rooney [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Monday, September 18, 2006 5:11:13 PM Subject: Re: [VOTE] accept UIMA as a podling On 9/18/06, Ian Holsman [EMAIL PROTECTED] wrote: [ ] +1 Accept UIMA as an Incubator podling [ ] 0 Don't care [X] -1 Reject this proposal for the following reason: I'm sorry, but I have to vote -1 based on my new policy of rejecting any potential podling that can't explain what it is that they do within the first paragraph of the proposal. I'm a fairly intelligent person, but honestly I have no clue what an architecture and software framework for creating, discovering, composing and deploying a broad range of multi-modal analysis capabilities actually is, and I see little potential for any project that's so bad at selling themselves to actually grow a useful community. Additionally, I believe we decided that having the final vote thread point to a Wiki page was a bad idea. It would be good to resend this with the actual proposal content inline so everyone can be sure what they're actually voting on. -garrett
Re: [PROPOSAL] UIMA (Unstructured Information Management Architecture) Framework
Having finally read all the emails related to this proposal, I'm very much for this puppy entering the ASF and eventually getting it going with Lucene and friends. A few questions. 1. What you are proposing for the ASF is the UIMA 2.0 code that currently lives on SF, correct? 2. What about the SDK, and could you tell me/us what's in the SDK that is not in the SF code? (I'm confused, because your proposal includes references to tools for development and design of UIMA components, but doesn't that typically live in an SDK?) 3. I'm a bit puzzled why something that sounds like a framework/pipeline for hooking up components with pre-defined input/output adapters ends up with a 400-page user guide/book. Perhaps I should present this as a question. How come? Or is that user guide for the SDK only? Otis - Original Message From: Marshall Schor [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Saturday, September 9, 2006 8:00:57 AM Subject: [PROPOSAL] UIMA (Unstructured Information Management Architecture) Framework Hello everyone, I'm restarting this thread on the Unstructured Information Management Architecture implementation (UIMA) framework, in the hopes of moving this along better; this time it also has the prefix [PROPOSAL] which I had left out due to over-excitement at doing my first posting to this list :-) . Please consider this proposal (on the incubator wiki because it is quite long: http://wiki.apache.org/incubator/UimaProposal ), and help us move it along toward getting it voted on by the Incubator PMC. Two important clarifying emails (as well as the whole previous thread) can be found here: http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5987788.html and http://www.nabble.com/Re%3A-Proposal-for-a-new-incubation-project%3A-Unstructured-Information-Management-Architecture---UIMA-p5986403.html (These are also hyperlinked in the wiki at the end of the first small section.)
-Marshall Leo Simons wrote: On Fri, Aug 25, 2006 at 06:04:04PM +0200, Thilo Goetz wrote: snip/ I hope this gives you a better idea what UIMA is about Yep, this and other explanations made it a lot clearer, thanks! UIMA sounds ambitious and interesting. cheers, Leo Niclas Hedhman wrote: On Thursday 24 August 2006 03:21, Marshall Schor wrote: Proposal for Incubation Project: Unstructured Information Management Architecture - UIMA From going from WTF is this to Hmmm... interesting after Leo's brilliant please clarify (reusable as well) mail. I think this is an area that has plenty of potential, possibly with a lot of interested parties in academia at large, I think ASF could be a good community breeding ground. I'm in favour of this, but not capable of contributing in any form. Cheers Niclas Yonik Seeley wrote: On 8/26/06, Thilo Goetz [EMAIL PROTECTED] wrote: From an application perspective, we have great hopes for a cooperation with the Lucene project. Great, I think this is something I'd like to get involved in! I've been thinking about how Solr integration could work. You then also need a search engine that can index that extra information and make it available for search. Without getting into too much detail here, some info could be immediately usable by Lucene based apps (like entity extraction, where you can add info via a new field in the document). Parts-of-speech type of stuff is currently more difficult, of course. -Yonik
Re: Adding Jeff Rodenburg to Lucene.Net as a committer
+1 for Jeff, Lucene.Net needs him! ;) Otis - Original Message From: George Aroush [EMAIL PROTECTED] To: general@incubator.apache.org Cc: Erik Hatcher [EMAIL PROTECTED]; Doug Cutting [EMAIL PROTECTED]; Jeff Rodenburg [EMAIL PROTECTED] Sent: Wednesday, April 12, 2006 11:21:51 PM Subject: Adding Jeff Rodenburg to Lucene.Net as a committer Hi folks, I am looking to add Jeff to Lucene.Net as a committer. Jeff is very active with Lucene.Net at SourceForge.net and I believe he will be a good addition to Lucene.Net. This is what I found http://apache.planetmirror.com.au/dev/pmc.html in regards to how and what needs to be done to add Jeff. Please advise if this is not it. Jeff: To get you on board, please start here: http://apache.planetmirror.com.au/dev/new-committers-guide.html -- there is some paperwork which you have to take care of. Erik, Doug: I don't know if we really need to vote on this; if so, my vote is +1. Regards, -- George
Re: [VOTE] accept Solr into incubator
This is as clear as day: +1. Otis - Original Message From: Doug Cutting [EMAIL PROTECTED] To: general@incubator.apache.org Sent: Tue 10 Jan 2006 12:21:49 PM EST Subject: [VOTE] accept Solr into incubator I propose that we accept CNET's Solr project into the incubator. Discussion on this list evidenced broad interest in this project, which bodes well for its ability to build a developer community. The Lucene PMC would be happy to accept Solr as a Lucene sub-project once it graduates from the incubator. The proposal is at: http://wiki.apache.org/incubator/SolrProposal +1 Doug
Re: Derby page updated
Unfortunately, I think sites are not being (r)synced every 4 hours. I made changes to the Lucene web site several days ago (ssh to minotaur, svn up under /www/lucene). Maybe somebody on infrastructure will know what is happening. No rush for Lucene. Otis --- David Crossley [EMAIL PROTECTED] wrote: Garrett Rooney wrote: I believe a 'svn up' needs to be run by someone on minotaur (perhaps in /www/incubator.apache.org?), then you need to wait for the sync job to copy the results over to the current web server (ajax?). Thanks for clarifying. I just did that 'svn up' so now we need to wait for the rsync to ajax. This quarter's report to the board for Infrastructure shows that the rsync happens every 4 hours. --David
Ruby Lucene port - directly under Lucene TLP?
Hello, If an existing TLP, such as Lucene, wants to develop a port, such as a Ruby port of the Lucene library, can the Lucene PMC invite that port and its developers under its wing directly, or does the port need to go through the Incubator? Note that this port is not an existing external project, but a brand new port that would be developed under Lucene from scratch by a group of 4-5 developers. Thanks, Otis