On 7/23/03 12:42 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]>
wrote:
> 
> Intriguing idea.  How broad do you see this being?  POI plus the long-lost
> sibling projects (like the Cocoon serializers)?  Other Apache projects
> (Batik?)  Concentrating on MS Office formats, or would this be  a place
> for other Java-based office-like parsers (OpenOffice, SmartSuite, Corel)
> to hang out?   Is there any synergy by having all these projects under one
> umbrella, such as internal reuse?  Obviously, we can share core code for
> CDF.  Anything else?  Can you paint a mental picture of what you would
> want this to look like in 18 months?
> 
> -Rob

Our bold vision for Batik or siblings kind of needs to be gentle.  Apache
politics are something of a quagmire.  Theoretically, the board would like
jakarta.apache.org to disappear into thin air.  They never want to create a
project based on a language again and want all the subprojects to become top
level.

The POI project has held together and grown through tight scope.  We're
loosening this a little ATM with the TNEF issue.  TNEF seems like it
should be part of POI even though it's not OLE 2 Compound Document based.  I
suspect there will be other file formats that will come around and "smell"
like POI but not really fit right in.  That being said, I want TNEF in here
now, not after we talk organization politics, so pragmatism outweighs the
issue.

I am of the feeling that file formats should be free.  Meaning if I author a
document in Lotus, I ought to be able to get at that data and munge it into
any format I like.  It's my data; give it to me now, as I like it.  I think
most folks on this project are of that opinion and of the opinion that
twisting bits and wading through hex dumps in search of the golden nugget is
FUN!  Secondly, hey, we're all in it for the money at some point.  The POI
developers are the best developers I've ever had the privilege of working
with.

Batik and other technologies often tie the encoding too close to the target
encoding.  Meaning there is no separation between the XML parsing part and
the binary encoding part of Batik in some places.  That's a problem for many
reasons.  The biggest is repurposing and flexibility.

My vision for POI is that it should keep its tight scope.  POI should be
focused on OLE 2 Compound Document formats.  Nothing more and nothing less.
I see the XML stuff living here because that's where the people working on
it live.  (The XML stuff which is directly related to POI APIs)

My vision for fileformats.apache.org is that there are many other things and
formats that aren't OLE 2 Compound Document based that should also be free
and debugged.  These are *other productivity suites*, graphics formats, etc.
POI can't swallow non-OLE 2 CDF file formats: they don't fit with the rest
of our code base, the structures are different, and they look like bits hanging
off.  POI is part of fileformats.apache.org.  It's a model for other
projects.

CDF (fileformats.apache.org/cdf) would *use* POI and potentially XHSSF or
XHWPF and contain Java APIs for manipulating it, DTDs, Schemas, etc.  What's
more is that I'd like to see code in C, C#, Java, C++ (yuck), whatever.

What I'd kind of like to do for ApacheCon is pull our ranks together.  Come up
with kind of a mission and consensus on direction.  Show kind of a game
plan.  It's a great chance not only to tell people about what we're doing at
POI and why Open Source works so well for this, but to rally the troops and get
them excited about a new effort.  It's also a chance for us to all meet and
talk shop, find whiteboards and see the whites of each other's eyes.

In 18 months:

FileFormats.Apache.Org - founded, PMC, etc.

I POI - Focus on OLE 2 CDF file formats
  a. POIFS - memory mapping, random access support added
  b. HSSF - memory mapping, random access, tighter memory model, image
support, formulas finished, graphing support finished, people whining for
pivot tables (maybe me getting a client to fund that ;-) )... Details
filling out, syntax candy
  c. HWPF - Reading and writing documents, memory map, Random Access...
  d. HPSF - Write support added
  e. XHSSF - Serializing XHSSF format to XLS and Generating XHSSF from XLS
  f. ???  - Your commonly used OLE 2 CDF based file format here.
II TNEF - or possibly a mail encodings project if it's not big enough
III CDF 
  a. APIs (Java, C??) for Reading CDF format, Writing CDF format
  b. HWPF plugin for reading DOC as CDF or writing to DOC with CDF
  c. OOo plugin...
IV CSSF - Common Spreadsheet format
  a. APIs (Java, C) for reading Common Spreadsheet Format...
  b. HSSF plugin
  c. OOo plugin
  d. Gnumeric plugin
V Lotus???  
VI image formats...
VII


I even question whether plugins need to be written in multiple languages to
support multiple APIs... The GNU Compiler for Java (GCJ) might let us write one
plugin in Java and have it plug in to multiple places.

The synergies will find themselves really...  I anticipate that the PMC will
have like all of the current POI committers and perhaps folks from projects
like Batik...  

I know we have a bit of a manpower issue on POI ATM.  I mean this project
isn't like other projects where any bozo behind an IDE and a Java compiler
can do it...  (I tend to try and encourage people otherwise)  It takes a
twisted kind of person..  Right now we attract maybe 1 person to be
particularly active every three months and have one burn out every six
months...  We're starting to get a new breed of folks who commit patches now
and again and folks who put in contrib modules. . . I think CDF and some of
that could appeal to a wider audience (because they can dream up XML tag
languages...bores the hell out of me but most people seem to like it) and we
can suck people into the depths of the project by refusing to do things when
they need them and saying "you do it"...  (Andy's dirty trick #1... ;-) )
Besides, as I land more paid work for existing people I think others will
come in.  (My personal goal is to get all of the core guys to where they can
afford to devote more time to it)

Anyhow, I haven't thought too much about this...  I'm more interested in
what everyone else has to say about it...  I can bla bla all day if you ask
the wrong questions ;-)

So what does everyone else think?

-Andy
-- 
Andrew C. Oliver
http://www.superlinksoftware.com/poi.jsp
Custom enhancements and Commercial Implementation for Jakarta POI

http://jakarta.apache.org/poi
For Java and Excel, Got POI?
