semantic Search in Farsi News for more relevant results returned from search engines
Best regards.
----- Forwarded Message -----
From: Alireza Kh master_kh...@yahoo.com
To: u...@uima.apache.org
Sent: Tuesday, August 21, 2012 4:14 PM

I am a graduate student; my name is Ali Raza Khodabakhshi. My thesis title is "Semantic Search in Farsi News for more relevant results returned from search engines". Having done my research, I realized that the software stack (Solr, Nutch, SIREn, UIMA) can help me with this, but I have doubts about some aspects:

1. Do these applications fully support the Persian language?
2. For semantic search engines, are there other tools to add to the above list?

Faithfully yours,
MSc, Computer Engineer (Software)
Re: semantic Search in Farsi News for more relevant results returned from search engines
Could you detail the specific requirements for "fully support Persian language"? What are the qualities, aspects, and characteristics that need support, both for indexing of content and for processing of queries?

-- Jack Krupansky

-----Original Message-----
From: Alireza Kh
Sent: Saturday, August 25, 2012 6:20 AM
To: solr-user@lucene.apache.org
Subject: semantic Search in Farsi News for more relevant results returned from search engines
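For what it's worth on the analysis side: Lucene/Solr ships Persian analysis components, so basic tokenization and normalization are covered out of the box. A field type sketch along the lines of the `text_fa` example type (the exact filter chain and stopword path should be checked against your Solr version):

```xml
<fieldType name="text_fa" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- normalizes zero-width non-joiners so compound words tokenize -->
    <charFilter class="solr.PersianCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- folds Arabic-script variants (alef, yeh, heh forms) -->
    <filter class="solr.ArabicNormalizationFilterFactory"/>
    <filter class="solr.PersianNormalizationFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_fa.txt"/>
  </analyzer>
</fieldType>
```

Whether that counts as "full" support depends on the requirements above; there is, for example, no Persian stemmer in the stock distribution.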
RE: Solr-4.0.0-Beta Bug with Load Term Info in Schema Browser
This is a bug in the Solr 4.0.0-Beta Schema Browser: Load Term Info shows 9682 for News, but a direct query shows 3577.

/solr/core0/select?q=channel:News&facet=true&facet.field=channel&rows=0

<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">1</int>
    <lst name="params">
      <str name="facet">true</str>
      <str name="q">channel:News</str>
      <str name="facet.field">channel</str>
      <str name="rows">0</str>
    </lst>
  </lst>
  <result name="response" numFound="3577" start="0"/>
  <lst name="facet_counts">
    <lst name="facet_queries"/>
    <lst name="facet_fields">
      <lst name="channel">
        <int name="News">3577</int>
        <int name="Blogs">0</int>
        <int name="Message Boards">0</int>
        <int name="Video">0</int>
      </lst>
    </lst>
    <lst name="facet_dates"/>
    <lst name="facet_ranges"/>
  </lst>
</response>

-----Original Message-----
Sent: August-24-12 11:29 PM
To: solr-user@lucene.apache.org
Cc: sole-...@lucene.apache.org
Subject: RE: Solr-4.0.0-Beta Bug with Load Term Info in Schema Browser
Importance: High

Any news? CC: Dev

-----Original Message-----
Subject: Solr-4.0.0-Beta Bug with Load Term Info in Schema Browser

Hi there,
Load Term Info shows 3650 for a specific term MyTerm, but when I execute the query channel:MyTerm it shows 650 documents found… possibly a bug… It happens after I commit data too; nothing changes. This field is a single-valued, non-tokenized string.

-Fuad

--
Fuad Efendi
416-993-2060
http://www.tokenizer.ca
Re: Solr-4.0.0-Beta Bug with Load Term Info in Schema Browser
If you optimize the index, are the results the same? Maybe it is showing counts that include deleted docs (I think it does, and this is expected).

ryan

On Sat, Aug 25, 2012 at 9:57 AM, Fuad Efendi f...@efendi.ca wrote:
This is a bug in the Solr 4.0.0-Beta Schema Browser: Load Term Info shows 9682 for News, but a direct query shows 3577.
Re: Solr Score threshold 'reasonably', independent of results returned
It will never return no results, because the cutoff is relative to the score of the previous result: if score < 0.25 * last_score, then stop. Since score > 0 and last_score is 0 for the initial hit, it will not stop on the first hit.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4003247.html
Sent from the Solr - User mailing list archive at Nabble.com.
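The cutoff rule being discussed ("stop once a hit scores below a quarter of the previous hit's score") can be sketched as follows; the 0.25 factor is from this thread, while the hit representation and function name are assumptions:

```python
def truncate_by_relative_score(hits, factor=0.25):
    """Keep hits until one scores below `factor` times the previous hit's score.

    `hits` is a list of (doc_id, score) pairs sorted by descending score.
    The first hit always survives (last_score starts at 0, so the cutoff
    condition cannot fire), which is exactly the point being made here:
    a non-empty result list is never truncated to nothing.
    """
    kept = []
    last_score = 0.0
    for doc_id, score in hits:
        if kept and score < factor * last_score:
            break
        kept.append((doc_id, score))
        last_score = score
    return kept

truncate_by_relative_score([("a", 8.0), ("b", 7.5), ("c", 1.0)])
# -> [("a", 8.0), ("b", 7.5)]  since 1.0 < 0.25 * 7.5
```

Note the comparison is against the immediately preceding hit, not the top hit, so a gradual score decay is never cut off.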
Re: Solr Score threshold 'reasonably', independent of results returned
You are right, Mr. Ravish, because this depends on the formula (ranking and search fields), but please allow me to say that the Solr score can help us decide whether a document is relevant or not in some cases.

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Score-threshold-reasonably-independent-of-results-returned-tp4002312p4003248.html
Sent from the Solr - User mailing list archive at Nabble.com.
RecursivePrefixTreeStrategy class not found
According to the document I was reading here: http://wiki.apache.org/solr/SolrAdaptersForLuceneSpatial4

"First, you must register a spatial field type in the Solr schema.xml file. The instructions in this whole document imply the RecursivePrefixTreeStrategy (http://wiki.apache.org/solr/RecursivePrefixTreeStrategy) based field type used in a geospatial context."

<fieldType name="geo" class="org.apache.solr.spatial.RecursivePrefixTreeFieldType"
           spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
           distErrPct="0.025" maxDetailDist="0.001"/>

I need to set the fieldType to RecursivePrefixTreeStrategy (http://wiki.apache.org/solr/RecursivePrefixTreeStrategy), and of course I'm getting class not found. I'm using the latest Solr 4.0.0-BETA.

I have a field that I would like to import into Solr that is a MULTIPOLYGON. For example:

TUV  Tuvalu  MULTIPOLYGON (((179.21322733454343 -8.561290924154292, 179.20240933453334 -8.465417924064994, 179.2183813345482 -8.481890924080346, 179.2251453345545 -8.492217924089957, 179.23109133456006 -8.50491792410179, 179.23228133456115 -8.51841792411436, 179.23149133456042 -8.533499924128407, 179.22831833455746 -8.543426924137648, 179.22236333455191 -8.554145924147633, 179.21322733454343 -8.561290924154292)), ((177.2902543327525 -6.114445921875486, 177.28137233274424 -6.109863921871224, 177.27804533274116 -6.099445921861516, 177.28137233274424 -6.089445921852203, 177.3055273327667 -6.10597292186759, 177.2958093327577 -6.113890921874969, 177.2902543327525 -6.114445921875486)), ((176.30636333183617 -6.288335922037433, 176.29871833182904 -6.285135922034456, 176.29525433182584 -6.274581922024623, 176.30601833183584 -6.260135922011173, 176.31198133184142 -6.28215492203168, 176.30636333183617 -6.288335922037433)), ((178.69580033406152 -7.484163923151129, 178.68885433405507 -7.480835923148035, 178.68878133405497 -7.467572923135677, 178.7017813340671 -7.475208923142787, 178.69580033406152 -7.484163923151129)))

Since the LSP was moved into Solr, would
there be a different name for the class? (I'm not sure the factory class above can be found yet either) Any help would be much appreciated! This communication (including all attachments) is intended solely for the use of the person(s) to whom it is addressed and should be treated as a confidential AAA communication. If you are not the intended recipient, any use, distribution, printing, or copying of this email is strictly prohibited. If you received this email in error, please immediately delete it from your system and notify the originator. Your cooperation is appreciated.
RE: RecursivePrefixTreeStrategy class not found
SORRY! RecursivePrefixTreeFieldType cannot be found!

Sent: Saturday, August 25, 2012 6:30 PM
To: solr-user@lucene.apache.org
Subject: RecursivePrefixTreeStrategy class not found
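For anyone hitting the same class-not-found: the Lucene-spatial adapters were still landing around the 4.0 betas, and the field type was renamed along the way. In builds where the module is present, the registration looks roughly like this (class and attribute names taken from the 4.0-era spatial adapters; verify against the exact build you run, and note the JTS jar must be on the classpath for polygon support):

```xml
<fieldType name="geo" class="solr.SpatialRecursivePrefixTreeFieldType"
           spatialContextFactory="com.spatial4j.core.context.jts.JtsSpatialContextFactory"
           distErrPct="0.025" maxDistErr="0.000009" units="degrees"/>
```

If the class is missing from your beta, the practical options are upgrading to a build that includes it or compiling the spatial module yourself.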
Re: Solr-4.0.0-Beta Bug with Load Term Info in Schema Browser
The index directory will include files which list deleted documents (I do not remember the suffix). If you do not like this behavior, you can add expungeDeletes to your commit requests.

On Sat, Aug 25, 2012 at 10:27 AM, Ryan McKinley ryan...@gmail.com wrote:
If you optimize the index, are the results the same? Maybe it is showing counts that include deleted docs (I think it does, and this is expected).

--
Lance Norskog
goks...@gmail.com
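For reference, a commit that purges deleted documents can be sent to the update handler as a plain XML message (the core name here is assumed):

```xml
<!-- POST this body to /solr/core0/update with Content-Type: text/xml -->
<commit expungeDeletes="true"/>
```

expungeDeletes only merges away segments that contain deletions, so it is cheaper than a full optimize; either one should bring the Load Term Info counts back in line with query results.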
Re: How do I represent a group of customer key/value pairs
There are more advanced ways to embed hierarchy in records. This page describes them: http://wiki.apache.org/solr/HierarchicalFaceting (This is a great page; I had never noticed it.)

On Fri, Aug 24, 2012 at 8:12 PM, Sheldon P sporc...@gmail.com wrote:
Thanks for the prompt reply, Jack. Could you point me towards any code examples of that technique?

On Fri, Aug 24, 2012 at 4:31 PM, Jack Krupansky j...@basetechnology.com wrote:
The general rule in Solr is simple: denormalize your data. If you have some maps (or tables) and a set of keys (columns) for each map (table), define fields with names like map-name_key-name, such as map1_name, map2_name, map1_field1, map2_field1. Solr has dynamic fields, so you can define map-name_* to have a desired type, if all the keys have the same type.

-- Jack Krupansky

-----Original Message-----
From: Sheldon P
Sent: Friday, August 24, 2012 3:33 PM
To: solr-user@lucene.apache.org
Subject: How do I represent a group of customer key/value pairs

I've just started to learn Solr and I have a question about modeling data in the schema.xml. I'm using SolrJ to interact with my Solr server. It's easy for me to store key/value pairs where the key is known. For example, if I have:

title=Some book title
author=The authors name

I can represent that data in the schema.xml file like this:

<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="author" type="text_general" indexed="true" stored="true"/>

I also have data that is stored as a Java HashMap, where the keys are unknown:

Map<String, String> map = new HashMap<String, String>();
map.put("some unknown key", "some unknown data");
map.put("another unknown key", "more unknown data");

I would prefer to store that data in Solr without losing its hierarchy.
For example:

<field name="map" type="maptype" indexed="true" stored="true">
  <field name="some unknown key" type="text_general" indexed="true" stored="true"/>
  <field name="another unknown key" type="text_general" indexed="true" stored="true"/>
</field>

Then I could search for "some unknown key" and receive "some unknown data". Is this possible in Solr? What is the best way to store this kind of data?

--
Lance Norskog
goks...@gmail.com
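Jack's denormalization advice amounts to flattening each map into prefixed field names before the document is sent to Solr, with a dynamicField rule like <dynamicField name="map1_*" type="text_general" indexed="true" stored="true"/> catching them on the schema side. A minimal sketch of the client-side flattening (the prefix convention and function name are assumptions, not Solr API):

```python
def flatten_map(prefix, mapping):
    """Flatten a dict into Solr dynamic-field names like '<prefix>_<key>'."""
    return {"%s_%s" % (prefix, key): value for key, value in mapping.items()}

# fixed fields plus one flattened map in a single flat document
doc = {"title": "Some book title"}
doc.update(flatten_map("map1", {"color": "red", "size": "XL"}))
# doc now carries map1_color and map1_size alongside title
```

Keys containing spaces (like "some unknown key") would need sanitizing into legal field names; the hierarchy itself is lost, which is why the HierarchicalFaceting page is the next thing to look at.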
Re: More debugging DIH - URLDataSource
About XPaths: the XPath engine supports only a limited range of XPaths, but the doc says that your paths are covered.

About logs: you only have the RegexTransformer listed. You need to add LogTransformer to the transformer list: http://wiki.apache.org/solr/DataImportHandler#LogTransformer

Having XML entity codes (&amp;) in the url string seems right. Can you verify the URL that goes to the remote site? Can you read the logs at the remote site? Can you run this through a proxy and watch the data?

On Fri, Aug 24, 2012 at 1:34 PM, Carrie Coy c...@ssww.com wrote:
I'm trying to write a DIH to incorporate page-view metrics from an XML feed into our index. The DIH makes a single request and updates 0 documents. I set the log level to finest for the entire dataimport section, but I still can't tell what's wrong. I suspect the XPath. http://localhost:8080/solr/core1/admin/dataimport.jsp?handler=/dataimport returns 404. Any suggestions on how I can debug this?

* solr-spec 4.0.0.2012.08.06.22.50.47

The XML data:

<?xml version='1.0' encoding='UTF-8'?>
<ReportDataResponse>
  <Data>
    <Rows>
      <Row rowKey="P#PRODUCT: BURLAP POTATO SACKS (PACK OF 12) (W4537)#N/A#5516196614" rowActionAvailability="0 0 0">
        <Value columnId="PAGE_NAME" comparisonSpecifier="A">PRODUCT: BURLAP POTATO SACKS (PACK OF 12) (W4537)</Value>
        <Value columnId="PAGE_VIEWS" comparisonSpecifier="A">2388</Value>
      </Row>
      <Row rowKey="P#PRODUCT: OPAQUE PONY BEADS 6X9MM (BAG OF 850) (BE9000)#N/A#5521976460" rowActionAvailability="0 0 0">
        <Value columnId="PAGE_NAME" comparisonSpecifier="A">PRODUCT: OPAQUE PONY BEADS 6X9MM (BAG OF 850) (BE9000)</Value>
        <Value columnId="PAGE_VIEWS" comparisonSpecifier="A">1313</Value>
      </Row>
    </Rows>
  </Data>
</ReportDataResponse>

My DIH:

<dataConfig>
  <dataSource name="coremetrics" type="URLDataSource" encoding="UTF-8"
              connectionTimeout="5000" readTimeout="10000"/>
  <document>
    <entity name="coremetrics" dataSource="coremetrics" pk="id"
            url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=**&amp;username=&amp;format=XML&amp;userAuthKey=&amp;language=en_US&amp;viewID=9475540&amp;period_a=M20110930"
            processor="XPathEntityProcessor" stream="true"
            forEach="/ReportDataResponse/Data/Rows/Row"
            logLevel="fine" transformer="RegexTransformer">
      <field column="part_code" name="id"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
             regex="/^PRODUCT:.*\((.*?)\)$/" replaceWith="$1"/>
      <field column="page_views"
             xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
    </entity>
  </document>
</dataConfig>

This little test Perl script correctly extracts the data:

use XML::XPath;
use XML::XPath::XMLParser;

my $xp = XML::XPath->new(filename => 'cm.xml');
my $nodeset = $xp->find('/ReportDataResponse/Data/Rows/Row');
foreach my $node ($nodeset->get_nodelist) {
    my $page_name  = $node->findvalue('Value[@columnId="PAGE_NAME"]');
    my $page_views = $node->findvalue('Value[@columnId="PAGE_VIEWS"]');
    $page_name =~ s/^PRODUCT:.*\((.*?)\)$/$1/;
}

From the logs:

INFO: Loading DIH Configuration: data-config.xml
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter loadDataConfig
INFO: Data Configuration loaded successfully
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=full-import} status=0 QTime=2
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.SimplePropertiesWriter readIndexerProperties
INFO: Read dataimport.properties
Aug 24, 2012 3:53:10 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [ssww] REMOVING ALL DOCUMENTS FROM INDEX
Aug 24, 2012 3:53:10 PM org.apache.solr.handler.dataimport.URLDataSource getData
FINE: Accessing URL: https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl?clientId=*&username=***&format=XML&userAuthKey=**&language=en_US&viewID=9475540&period_a=M20110930
Aug 24, 2012 3:53:10 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:12 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=1
Aug 24, 2012 3:53:14 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=1
Aug 24, 2012 3:53:16 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport params={command=status} status=0 QTime=0
Aug 24, 2012 3:53:18 PM org.apache.solr.core.SolrCore execute
INFO: [ssww] webapp=/solr path=/dataimport
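As a concrete sketch of the LogTransformer suggestion, the entity would gain the extra transformer plus a template; the logTemplate text below is an assumption, the attribute names are from the DIH wiki, and the query parameters on the url are omitted for brevity:

```xml
<entity name="coremetrics" dataSource="coremetrics" pk="id"
        url="https://welcome.coremetrics.com/analyticswebapp/api/1.0/report-data/contentcategory/bypage.ftl"
        processor="XPathEntityProcessor" stream="true"
        forEach="/ReportDataResponse/Data/Rows/Row"
        transformer="RegexTransformer,LogTransformer"
        logTemplate="row read: ${coremetrics.part_code} views=${coremetrics.page_views}"
        logLevel="info">
  <field column="part_code" name="id"
         xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_NAME']"
         regex="^PRODUCT:.*\((.*?)\)$" replaceWith="$1"/>
  <field column="page_views"
         xpath="/ReportDataResponse/Data/Rows/Row/Value[@columnId='PAGE_VIEWS']"/>
</entity>
```

If zero rows log at all, the forEach never matched; if rows log but fields are empty, the field XPaths are at fault. Note also that, unlike the Perl script, the DIH regex attribute is a plain Java regex, so the surrounding slashes in the original config may themselves prevent the match.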
Re: Solr - Index Concurrency - Is it possible to have multiple threads write to same index?
A few other things:

Support: many of the Solr committers do not like the Embedded server. It does not get much attention, so if you find problems with it you may have to fix them and get someone to review and commit the fixes. I'm not saying they sabotage it; there is just not much interest in making it first-class.

Replication: you can replicate from the Embedded server with the old rsync-based replicator. The Java replication tool requires servlets. If you are Unix-savvy, the rsync tool is fine.

Indexing speed:
1) You can use shards to split the index into pieces. This divides the indexing work among the shards.
2) Do not store the giant data. A lot of sites instead archive the datafile and index a link to the file. Giant stored fields cause indexing speed to drop dramatically because stored data is not saved just once: it is copied repeatedly during merging as new documents are added. Index data is also copied around, but this tends to increase sub-linearly, since documents share terms.
3) Do not store positions and offsets. These allow you to do phrase queries because they store the position of each word. They take a lot of memory and have to be copied around during merging.

On Thu, Aug 23, 2012 at 1:31 AM, Mikhail Khludnev mkhlud...@griddynamics.com wrote:
I know the following drawbacks of EmbeddedServer:
- org.apache.solr.client.solrj.request.UpdateRequest.getContentStreams(), which is called on handling an update request, creates a lot of garbage in memory and bloats the heap with expensive XML.
- org.apache.solr.response.BinaryResponseWriter.getParsedResponse(SolrQueryRequest, SolrQueryResponse) does something similar on the response side; it also bloats your heap.

For me, your task is covered by multiple cores. Anyway, if you are OK with EmbeddedServer, let it be. Just be aware of the stream updates feature: http://wiki.apache.org/solr/ContentStream

My average indexing speed estimate is for fairly small docs of less than 1K (which are always used for micro-benchmarking).
Heavy analysis is the key argument for invoking updates in multiple threads. What is your CPU utilization during indexing?

On Thu, Aug 23, 2012 at 7:52 AM, ksu wildcats ksu.wildc...@gmail.com wrote:
Thanks for the reply, Mikhail. For our needs speed is more important than flexibility, and we have huge text files (e.g., blogs/articles of ~2 MB) that need to be read from our filesystem and then stored into the index. Our app creates a separate core per client (dynamically), and there is one instance of EmbeddedSolrServer per core that is used for adding documents to the index. Each document has about 10 fields, and one of the fields has ~2 MB of data stored (stored=true, analyzed=true). We also have logic built into our webapp to dynamically create the Solr config files (solrconfig and schema per core; filter/analyzer/handler values can differ per core) for each core before creating an instance of EmbeddedSolrServer for that core. Another reason to go with EmbeddedSolrServer is to reduce the overhead of transporting large data (~2 MB) over HTTP/XML. We use this setup for building our master index, which then gets replicated to slave servers using the replication scripts provided by Solr. We also have the Solr admin UI integrated into our webapp (using the admin JSP handlers from the Solr admin UI). We have been using this multi-core setup for more than a year now, and so far we haven't run into any issues with EmbeddedSolrServer integrated into our webapp. However, I am now trying to figure out the impact if we allow multiple threads to send requests to the same EmbeddedSolrServer core for adding docs to the index simultaneously. Our understanding was that EmbeddedSolrServer would give us better performance than HTTP Solr for our needs. It's quite possible that we might be wrong and HTTP Solr would have given us similar or better performance. Also, based on the documentation from the Solr wiki, I am assuming that the EmbeddedSolrServer API is the same as the one used by HTTP Solr.
That said, can you please tell me if there is any specific downside to using EmbeddedSolrServer that could cause issues for us down the line? I am also interested in your comment below about indexing one million docs in a few minutes. Ideally we would like to get to that speed. I am assuming this depends on the size of the docs and the type of analyzer/tokenizer/filters being used, correct? Can you please share (or point me to documentation on) how to get this speed for one million docs.

- one million is a fairly small amount; on average it should be indexed in a few minutes. I doubt that you really need to distribute indexing

Thanks
-K

--
View this message in context: http://lucene.472066.n3.nabble.com/Solr-Index-Concurrency-Is-it-possible-to-have-multiple-threads-write-to-same-index-tp4002544p4002776.html
Sent from the Solr - User mailing list archive at Nabble.com.

--
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics
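On the multiple-threads question itself: Lucene/Solr index writers are designed for concurrent adds, so the usual pattern is to partition the document stream into batches fed from a small worker pool, committing once at the end. A language-neutral sketch with the client call stubbed out (`add_batch` stands in for something like SolrServer.add(batch); everything here is an illustration, not Solr API):

```python
from concurrent.futures import ThreadPoolExecutor

def index_in_parallel(docs, add_batch, workers=4, batch_size=100):
    """Partition docs into batches and index them from multiple threads.

    `add_batch` is called once per batch, possibly concurrently, so it
    must be thread-safe -- which a shared Solr server handle is.
    Commit once after the pool drains, not per batch.
    """
    batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() forces completion and surfaces any worker exceptions
        list(pool.map(add_batch, batches))

# demo with a stub "server": collect everything that would be indexed
indexed = []
index_in_parallel([{"id": str(n)} for n in range(250)], indexed.extend, batch_size=100)
```

A handful of threads is usually enough to saturate the analyzers; beyond that, CPU during analysis (Mikhail's question above) is the number to watch.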
Re: How do I represent a group of customer key/value pairs
Thanks, Lance. It looks like it's worth investigating. I've already started down the path of using a bean with @Field("map_*") on my HashMap setter; this issue tipped me off to that functionality: https://issues.apache.org/jira/browse/SOLR-1357

That technique provides me with a mechanism to store the HashMap data, but it flattens the structure. I'll play with the ideas provided on http://wiki.apache.org/solr/HierarchicalFaceting. If anyone has some sample code (Java + schema.xml) they can point me to that does hierarchical faceting, I would very much appreciate it.

On Sat, Aug 25, 2012 at 6:42 PM, Lance Norskog goks...@gmail.com wrote:
There are more advanced ways to embed hierarchy in records. This page describes them: http://wiki.apache.org/solr/HierarchicalFaceting