Hi, Wojtek.
It's great to hear of your interest in this GSoC project. Your success with
the Tuscany CORBA binding project in GSoC 2008 is really encouraging.
Your understanding pretty much matches what I have in mind. A few more
comments:
1) Indexing: I think indexing should not be purely keyword-based. It
will also involve indexing artifacts by QName (such as the QNames of Java
classes, composites, WSDLs, XSDs, and BPEL files). The runtime
processing of SCA contributions can also benefit from this work. For
example, Tuscany already lazily loads WSDL/XSD files when it needs to
resolve references by QName. We should apply the same strategy to composite
files too.
2) The search can be based on keywords, structural URIs, QNames of various
artifacts, policy settings, etc.
3) The search capability could potentially be integrated with the management
of the SCA domain.
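To make the QName idea in 1) more concrete, here is a minimal sketch in
Python. It is not Tuscany code: the names QNameIndex, LazyArtifact and the
loader callback are hypothetical, and a QName is modelled simply as a
(namespace, local name) pair. It only illustrates registering an artifact
under its QName cheaply and parsing the file the first time that QName is
resolved.

```python
from typing import Callable, Dict, Optional, Tuple

# Assumption: a QName is represented as (namespace URI, local name).
QName = Tuple[str, str]


class LazyArtifact:
    """Hypothetical holder: remembers where an artifact lives, loads it on demand."""

    def __init__(self, uri: str, loader: Callable[[str], object]):
        self.uri = uri
        self._loader = loader
        self._model: Optional[object] = None
        self.loaded = False

    def resolve(self) -> object:
        # Parse the underlying file only on first use, then reuse the result.
        if not self.loaded:
            self._model = self._loader(self.uri)
            self.loaded = True
        return self._model


class QNameIndex:
    """Hypothetical index from QName to a lazily loaded artifact."""

    def __init__(self) -> None:
        self._entries: Dict[QName, LazyArtifact] = {}

    def register(self, qname: QName, uri: str, loader: Callable[[str], object]) -> None:
        # Registration is cheap: nothing is parsed here.
        self._entries[qname] = LazyArtifact(uri, loader)

    def resolve(self, qname: QName) -> Optional[object]:
        entry = self._entries.get(qname)
        return entry.resolve() if entry is not None else None
```

The same index could serve both the search feature and the runtime's
reference resolution, which is why the loader is passed in rather than
hard-coded.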
Thanks,
Raymond
--------------------------------------------------
From: "Wojtek Janiszewski" <[email protected]>
Sent: Monday, March 30, 2009 2:19 PM
To: <[email protected]>
Subject: [GSoC 2009] Search in SCA domain manager web app
Hi,
I'm interested in taking part in Google Summer of Code, and the project
"tuscany-scadomain-search" [1] sounds interesting to me.
I've taken a quick look at the domain manager web app and Apache Lucene and
made a few assumptions to start with. I defined three main areas that the
project should cover: indexing, searching and presentation. Keeping
those areas separate allows us to write modular code and test it.
1. Indexing
- Indexing should include all available contributions. File names as well
as their contents (except for non-readable files such as Java classes)
should be indexed. Every indexed item should have a link to its parent
contribution.
- After a contribution is added, updated or deleted in the domain manager
web application, the affected items should be reindexed.
- We may also consider having connections between indexed items, i.e. we
could scan composite files to collect the names of their children and build
reverse links, so every indexed item (script, Java class etc.) could have
a connection to its parent composites.
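As a rough illustration of the reverse-link idea, the sketch below scans
composite XML and records, for each referenced implementation, which
composite files use it. It is written in Python with the standard library
only and is not based on Tuscany's own model classes. It assumes that
children are declared through elements whose local name starts with
"implementation." and that the target is given in a "class" or "script"
attribute; the function name reverse_links is made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, Set


def local_name(tag: str) -> str:
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def reverse_links(composites: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each referenced child (class or script name) to its parent composites.

    `composites` maps a composite file name to its XML text.
    """
    parents: Dict[str, Set[str]] = defaultdict(set)
    for composite_name, xml_text in composites.items():
        root = ET.fromstring(xml_text)
        for element in root.iter():
            # Assumption: implementation.java, implementation.script, etc.
            if not local_name(element.tag).startswith("implementation."):
                continue
            for attribute in ("class", "script"):
                target = element.get(attribute)
                if target:
                    parents[target].add(composite_name)
    return parents
```

With such a map, an indexed Java class or script can be stored together
with the list of composites that reference it.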
2. Searching
- The search feature would be accessible via the SCA domain manager web
application. It should allow users to:
-- search for files by name
-- search file contents
-- filter, i.e. search inside a specified contribution or composite
- Maybe we should consider extras such as Ajax hints while the user types
the search phrase?
- More research on Apache Lucene could provide more search ideas.
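The three search modes listed above (by name, by content, filtered by
contribution) can be sketched with a small in-memory inverted index. This
is only a stand-in written in plain Python to show the shape of the
feature; it is not Lucene's API, and the names SearchIndex and Item are
invented for the example.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Item:
    contribution: str  # parent contribution of the indexed file
    path: str          # file path inside the contribution


class SearchIndex:
    """Hypothetical inverted index with separate name and content fields."""

    def __init__(self) -> None:
        self._by_name: Dict[str, Set[Item]] = defaultdict(set)
        self._by_content: Dict[str, Set[Item]] = defaultdict(set)

    @staticmethod
    def _tokens(text: str) -> List[str]:
        # Lower-case alphanumeric runs; a real analyzer would do much more.
        return re.findall(r"[a-z0-9]+", text.lower())

    def add(self, contribution: str, path: str, content: Optional[str] = None) -> None:
        item = Item(contribution, path)
        for token in self._tokens(path):
            self._by_name[token].add(item)
        # Non-readable files are indexed by name only (content is None).
        if content is not None:
            for token in self._tokens(content):
                self._by_content[token].add(item)

    def search(self, term: str, field: str = "content",
               contribution: Optional[str] = None) -> List[Item]:
        postings = self._by_name if field == "name" else self._by_content
        hits = postings.get(term.lower(), set())
        if contribution is not None:
            # Filter: keep only hits from the requested contribution.
            hits = {item for item in hits if item.contribution == contribution}
        return sorted(hits, key=lambda item: (item.contribution, item.path))
```

Keeping name and content in separate fields is what makes "search by
name" and "search contents" independent queries over the same items.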
3. Presentation
- Each search result should be presented with its name and a link to the
contribution it belongs to. If it's viewable (i.e. not a Java class
etc.), a simple preview feature should be enabled for that item.
Matched text should of course be highlighted (as Google does).
- If information about the parent composites of an item is
available, those composites should also be listed.
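For the preview with highlighting, a first approximation could look like
the Python sketch below. It finds the first case-insensitive match, cuts a
short window of context around it, escapes the text for HTML and wraps the
match in bold tags. The function name highlight, the radius parameter and
the choice of bold tags are assumptions made for illustration.

```python
import html
import re
from typing import Optional


def highlight(text: str, term: str, radius: int = 30) -> Optional[str]:
    """Return an HTML snippet around the first match of `term`, or None."""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - radius)
    end = min(len(text), match.end() + radius)
    # Escape each piece separately so the <b> tags themselves stay intact.
    before = html.escape(text[start:match.start()])
    hit = html.escape(match.group(0))
    after = html.escape(text[match.end():end])
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return f"{prefix}{before}<b>{hit}</b>{after}{suffix}"
```

Escaping matters here because most previewed files are XML, so the raw
content would otherwise be interpreted as markup by the browser.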
This quick draft is the direction I'll take when writing my proposal. It
looks like an interesting project, especially since it lets me explore new
areas (everything beyond bindings in Tuscany, and Lucene). There is still
plenty of room for improvement (such as other features), so any comments
are welcome.
Thanks,
Wojtek
[1] -
http://wiki.apache.org/general/SummerOfCode2009#tuscany-scadomain-search