Yes, you have a lot of fields but if you want to do faceting and all possible fields you will probably have to index each node in your document as a separate field. You should look at solr, which supports facets out of the box. Keep in mind what type of tokenizer(s) you use as this effects your results when querying.

- Aleksander

On Mon, 17 Nov 2008 13:49:51 +0100, Bapat, Mayur <[EMAIL PROTECTED]> wrote:

My xml format similar to as follows -

<?xml version="1.0" encoding="UTF-8" ?><Attributes><IBA8909
type="float">0.0</IBA8909><IBA8654 type="float">0.0</IBA8654><folder
type="double">8254</folder><inheritedDomain
type="string">true</inheritedDomain><iterationInfo.branchId
type="double">8438</iterationInfo.branchId><Part
type="double">12771</Part><member
type="double">8443</member><IBA8608
type="string">Ver-dir(TOP)</IBA8608><IBA9119
type="float">0.0</IBA9119><genericType
type="string">standard</genericType><name
type="string">A001</name><versionInfo.identifier.versionId
type="string">A</versionInfo.identifier.versionId><thePersistInfo.update
Count
type="double">3</thePersistInfo.updateCount><IBA8929
type="string">Carbon
film</IBA8929><iterationInfo.creator
type="double">10</iterationInfo.creator><iterationInfo.note
type="string">just
classified</iterationInfo.note><IBA8950 type="float">0</IBA8950><IBA8794
type="string">default</IBA8794><defaultTraceCode
type="string">0</defaultTraceCode><IBA8866
type="double">0</IBA8866><IBA8897
type="float">0.0</IBA8897><name
type="string">TRIMMER</name><versionInfo.identifier.versionSortId
type="string">000001</versionInfo.identifier.versionSortId><number
type="string">0000000001</number><iterationInfo.state
type="string">ctrld</iterationInfo.state><IBA8925
type="float">0</IBA8925><IBA8754
type="float">0.0</IBA8754><IBA9097 type="float">0.0</IBA9097><IBA9123
type="float">0.0</IBA9123><thePersistInfo.createStamp
type="datetime">2008-07-16T08:34:20</thePersistInfo.createStamp><state.s
tate
type="string">INWORK</state.state><masterReference
type="double">8436</masterReference><IBA8839
type="float">0.0</IBA8839><thePersistInfo.updateStamp
type="datetime">2008-08-01T10:44:35</thePersistInfo.updateStamp><IBA8776
type="string">True</IBA8776><view type="double">2931</view><IBA8653
type="float">0.0</IBA8653><IBA8802
type="float">0.0</IBA8802><organizationReference
type="double">856</organizationReference><partType
type="string">separable</partType><IBA8786
type="float">0.0</IBA8786><checkoutInfo.state
type="string">c/i</checkoutInfo.state><thePersistInfo.markForDelete
type="double">0</thePersistInfo.markForDelete><IBA8733
type="string">0</IBA8733><state.atGate
type="string">false</state.atGate><iterationInfo.latest
type="string">true</iterationInfo.latest><IBA8800
type="double">0</IBA8800><Part
type="string">IBRTYPE|Part~~WCP|22037|1</Part><IBA9062
type="float">0.0</IBA9062><thePersistInfo.modifyStamp
type="datetime">2008-08-01T10:44:35</thePersistInfo.modifyStamp><iterati
onInfo.predecessor
type="double">8440</iterationInfo.predecessor><IBA9035
type="string">Industry</IBA9035><IBA8806 type="float">0</IBA8806><source
type="string">make</source><state.lifeCycleId
type="double">6193</state.lifeCycleId><IBA8827
type="string">Through.Hole</IBA8827><iterationInfo.modifier
type="double">10</iterationInfo.modifier><IBA8798
type="double">0</IBA8798><IBA8825
type="string">1</IBA8825><IBA8992 type="string">Open
type</IBA8992><IBA8949
type="string">Yes</IBA8949><versionInfo.identifier.versionLevel
type="double">1</versionInfo.identifier.versionLevel><defaultUnit
type="string">ea</defaultUnit><containerReference
type="double">8212</containerReference><endItem
type="string">false</endItem><IBA8796
type="string">1</IBA8796><folderingInfo.cabinet
type="double">8254</folderingInfo.cabinet><domainRef
type="double">8216</domainRef><IBA9060
type="float">0.0</IBA9060><iterationInfo.identifier.iterationId
type="string">2</iterationInfo.identifier.iterationId><IBA8080
type="string">sizeCode</IBA8080><IBA8891
type="float">0.0</IBA8891><IBA8804
type="float">0</IBA8804><IBA9005
type="float">0.0</IBA9005><checkoutInfo.derivedFrom
type="double">8440</checkoutInfo.derivedFrom></Attributes>

and I am interested in finding a document by giving a scope like -
find a document with IBA8949=Yes AND IBA8802=0.0 and
iterationInfo.predecessor=OR.

It looks like for each element from this XML, I need to create a field
for the index document and create a search query accordingly.




-----Original Message-----
From: Aleksander M. Stensby [mailto:[EMAIL PROTECTED]
Sent: Monday, November 17, 2008 1:55 PM
To: java-user@lucene.apache.org
Subject: Re: Scoped Search and Facets generation using Lucene

I think that the closest you get to "scoped" search in your case would
be to use filters. (If you index your paths, or if the documents have
some standarized format, I assume you could just use one field per
element in your document.)

Maybe you could say a bit about you document structure? (If your
question is a XML specific parsing question, the Lucene mailing list
isn't really right place...)

- Aleksander


On Fri, 14 Nov 2008 17:52:59 +0100, Otis Gospodnetic
<[EMAIL PROTECTED]> wrote:

Hi Mayur,

Solr has built-in support for facets.  I don't understand what you
mean by scoped searches.  Could you please give a concrete example?


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch




________________________________
From: "Bapat, Mayur" <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, November 14, 2008 3:04:12 AM
Subject: Scoped Search and Facets generation using Lucene

Hi,

Does Lucene support Scoped Searches? My intention is to index an XML
String and search for a matching element/attribute value from that XML

by specifying scope(path).
Also is there any direct support for Facets building in Lucene?

Regards,
Mayur

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



--
Aleksander M. Stensby
Senior software developer
Integrasco A/S
www.integrasco.no

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]





--
Aleksander M. Stensby
Senior software developer
Integrasco A/S
www.integrasco.no

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to