Re: How can I index this?

2012-01-18 Thread ahammad
That would certainly work.

Just as a general thing, how would one go about indexing Sharepoint content
anyway? I heard about the Sharepoint connector for Lucene but I know nothing
about it. Is there a standard best practice method?

Also, what are your thoughts on extending the DIH? Is that recommended?

Thanks for the input :)



Re: How can I index this?

2012-01-17 Thread ahammad
Perhaps I was a little unclear...

Normally when I have DB access, I do a regular indexing process using DIH.
For these two sources, I do not have direct DB access. I can only view the
two sources like any end-user would.

I do have a java class that can get the information I need. It fetches that
information through HTTP requests and does not have DB access. It is
currently being used for other purposes, but I can take it and use it for
Solr as well. Does that make sense?

Knowing all that, namely the fact that I cannot directly access the DB, and
I can make HTTP requests to get the info, how can I index that info? 
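
What I am picturing is something along these lines. This is only a rough
sketch; MyHttpDao and MyRecord are made-up stand-ins for the existing class
and whatever it returns:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class HttpSourceIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core0");
        MyHttpDao dao = new MyHttpDao();           // made-up stand-in for our fetcher class
        for (MyRecord record : dao.fetchAll()) {   // made-up record type
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", record.getId());
            doc.addField("title", record.getTitle());
            doc.addField("body", record.getBody());
            server.add(doc);
        }
        server.commit();  // make the new documents searchable
    }
}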

Please let me know if this clarifies what I am trying to do.

Regards



How can I index this?

2012-01-17 Thread ahammad
Hello,

I am looking into indexing two data sources. One of those is a standard
website and the other is a Sharepoint site. The problem is that I have no
direct database access. Normally I would just use the DIH and get what I
need from the DB. I do have a java DAO (data access object) class that I am
using directly to fetch information for a different purpose.

In cases like this, what would be the best way to index the data? Should I
somehow integrate Nutch as the crawler? Should I write a custom DIH? Can I
use the DAO that I have in conjunction with the DIH?

I am really looking for some recommendations here. I do have a few hacks
that could be done (copy the data into a DB and index with DIH), but I am
interested in the proper way. Any insight will be greatly appreciated.

Cheers



Upgrading from 1.4 to the latest version

2012-01-11 Thread ahammad
I was doing some reading on the new features and whatnot, and I am interested
in upgrading. I have a few questions though:

1) The index format seems to have changed. Can I reuse the current index, or
should I reindex the data? I read some things about optimizing the index,
but I am not clear on that.
2) Will SolrJ still work?
3) Have there been any changes in the config files or the schema files such
that my existing files won't work, or can I simply reuse them?

Thank you.



Re: Question about morelikethis and multiple fields

2010-11-03 Thread ahammad

I don't quite understand what you mean by that. Did you mean TermVector
Components?

Also, I did some more digging and I found some messages on this mailing list
about filtering. From what I understand, using the standard query handler
(solr/select/?q=...) with a qt parameter allows you to filter on the initial
response using the fq parameter. While this is not a perfect solution for my
application, it will greatly reduce any errors that I may get in the data.
However, when I tried fq, all it does is filter the result set from the mlt
handler, not the initial response. I need to filter on both the initial
response and the result set.
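
If I understand the mlt handler correctly, its q parameter only selects the
seed document, while fq constrains the list of similar documents. So one
idea I want to try (untested; "Article_1234" is a made-up id) is pinning the
seed with a uniqueKey query so that only the similar-documents list needs
the filter:

/solr/select?qt=%2Fmlt&q=id:Article_1234&mlt.fl=title,body&mlt.mintf=1&mlt.mindf=1&mlt.match.include=true&fq=dataItem:article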


Question about morelikethis and multiple fields

2010-11-03 Thread ahammad

Hello,

I'm trying to implement a "Related Articles" feature within my search
application using the mlt handler.

To give you a little background information, my Solr index contains a single
core that is created by merging 10+ other cores. Within this core is my main
data item known as an "article"; however, there are other data items like
"technical documents", "tickets", etc.

When a user opens an article on my web application, I want to show "Related
Articles" based on 2 fields (title and body). I am using SolrJ as a back-end
for this.

The way I'm thinking of doing it is to search on the title of the existing
article and hope that the first hit is the actual article. This works in
most cases, but occasionally it grabs either the wrong article or a
different type of data item altogether (the first hit may be a technical
document, which is totally unrelated to articles). The following is my
query:

?qt=%2Fmlt&mlt.match.include=true&mlt.mindf=1&mlt.mintf=1&mlt.fl=title,body&q=&fq=dataItem:article&debugQuery=true

One main thing I noticed is that this only seems to match on the "body"
field and not the "title" field. I suspect it's doing what it's supposed to
and I'm not fully grasping the idea of mlt.

So when it does the initial search to find the document against which it
will find related articles, which search handler does it use? Normally, my
queries are carried out using dismax with some boosting functionality
applied to them. When I use the standard query handler, however, with the qt
parameter pointing at mlt, what happens for the initial search?

Also, if anybody can suggest an alternative implementation to this I would
greatly appreciate it. Like I said, it's entirely possible that I don't
fully understand mlt and it's causing me to implement stuff in a weird way.

Thanks.



Shards VS Merged Core?

2010-10-20 Thread ahammad

Hello all,

I'm just wondering what the benefits/consequences are of using shards or
merging all the cores into a single core. Personally I have tried both, but
my document set is not large enough that I can actually test performance and
whatnot.

What is the better approach for implementing a search mechanism across
multiple cores (10-15 cores)?


Re: Matching exact words

2010-08-26 Thread ahammad

Hello Erick,

Thanks for the reply. I am a little confused by this whole stemming thing.
What exactly does it refer to?

Basically, I already have a field which is essentially a collection of many
other fields (done using copyField). This field is a text field. So what
you're saying is to have a duplicate of this field with different properties
such that it does not stem?

When querying, I assume that I will have to explicitly specify which field
to search against...is this correct?
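
Just to make sure I follow, is this roughly the kind of schema change you
mean? This is my guess; "title" and "body" stand in for my real source
fields, and "textExact" would be a new unstemmed type:

<field name="allText" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="allTextExact" type="textExact" indexed="true" stored="false" multiValued="true"/>
<copyField source="title" dest="allTextExact"/>
<copyField source="body" dest="allTextExact"/>

(I repeated the copyField lines with the new destination since, as far as I
can tell, copyField does not chain from one destination field to another.)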

I'm a little rusty on the solr stuff to be honest so please bear with me.

Thanks


Matching exact words

2010-08-26 Thread ahammad

Hello,

I have a case where if I search for the word "windows", I get results
containing both "windows" and "window" (and probably other things like
"windowing" etc.). Is there a way to find exact matches only?

The field in which I am searching is a text field, which as I understand
causes this behaviour. I cannot use a string field because it is very
restricted, but what else can be done? I understand there are other types of
text fields that are more strict than the standard field.
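
From what I have read, the stricter types are just analyzers without a
stemming filter. My untested guess at such a fieldType:

<fieldType name="textExact" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- note: no stemming filter such as solr.SnowballPorterFilterFactory -->
  </analyzer>
</fieldType>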

Ideally I would like to keep my index the way it is, with the ability to
force exact matches. For example, if I can search "windows -window" or
something like that, that would be great. Or if I can wrap my query in a set
of quotes to tell it to match exactly. I've seen that done before but I
cannot get it to work.

As a reference, here is my query:

q={!boost b=$db v=$qq
defType=$sh}&qq=windows&db=recip(ms(NOW,lastModifiedLong),3.16e-11,1,1)&sh=dismax

To be quite frank, I am not very familiar with this syntax. I am just using
whatever my old coworker left behind. 

Any tips on how to find exact matches or improve the above query will be
greatly appreciated.

Thanks


Performance issues when querying on large documents

2010-07-23 Thread ahammad

Hello,

I have an index with lots of different types of documents. One of those
types basically contains extracts of PDF docs. Some of those PDFs can have
1000+ pages, so there would be a lot of stuff to search through.

I am experiencing really terrible performance when querying. My whole index
has about 270k documents, but less than 1000 of those are the PDF extracts.
The slow querying occurs when I search only on those PDF extracts (by
specifying filters) and return 100 results. Returning 100 results definitely
adds to the issue, but even cutting that number down can be slow.

Is there a way to improve querying with such large results? To give an idea,
querying for a single word can take a little over a minute, which isn't
really viable for an application that revolves around searching. For now, I
have limited the results to 20, which makes the query execute in roughly
10-15 seconds. However, I would like to have the option of returning 100
results.
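
One mitigation I am trying in the meantime (just a guess, and the filter and
field names below are placeholders for my real ones) is to return only the
small stored fields instead of the full extracted text, since shipping the
large bodies back for every hit seems to be part of the cost:

?q=word&fq=docType:pdfExtract&fl=id,title,score&rows=100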

Thanks a lot.

 


Re: multi-valued associated fields

2010-05-12 Thread ahammad

In our deployment, we thought that complications might arise when attempting
to hit the Solr server with the addresses of too many cores. For instance,
we have 15+ cores running at the moment. In the worst case, we would have to
use the addresses of all 15+ cores to search all our data. What we
eventually did was to combine all the cores into a single core, which gives
us a cleaner solution. You get the simplicity of querying one core, but keep
the flexibility of modifying each source core separately.

Basically, we have all the cores indexing separately. We set up a script
that would use the index merge functionality of Solr to combine all the
indexes into a single index accessible through one core. Yes, there will be
some overhead on the server, but I believe that it's a good compromise. In
our case, we have multiple servers at our disposal, so this was not a
problem to implement. It all depends on your data set and the volume of
documents that you will be indexing. 
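
For reference, one way to script such a merge is Lucene's IndexMergeTool.
This is only a sketch with placeholder paths, not necessarily what our
script does verbatim:

java -cp lucene-core.jar:lucene-misc.jar org.apache.lucene.misc.IndexMergeTool \
     /solrHome/merged/data/index \
     /solrHome/core1/data/index /solrHome/core2/data/index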



Re: multi-valued associated fields

2010-05-12 Thread ahammad

I had the same problem as you last year, i.e. indexing stuff from different
sources with different characteristics. The way I approached it is by
setting up a multi-core environment, with each core representing one type of
data. Within each core, I had a "data type" sort of field that would define
what kind of data is stored (i.e. in your case, it would be "auto" or "real
estate" etc...).

The advantages of this setup is that it allows you to make changes to
individual cores without affecting anything else. Also, faceting based on
category is achieved by the data type field. You can do searching on
multiple cores like you would on a single core, meaning that all the search
parameters can be applied. Solr will automatically merge all the data into
one result set. Another advantage: if you index frequently, this approach
lets you index the cores at different times and reduce the overall load.
Just a thought on an approach...


Re: Issue with delta import (not finding data in a column)

2010-05-12 Thread ahammad

Hello,

I am not reusing the context object. The remaining part of the code takes in
a "Blob" object, converts it to a FileInputStream, and reads the contents
using PDFBox. It does not deal with anything related to Solr.

The Transformer doesn't even execute the remaining part of the code. It
doesn't get that far. Let me know if you need any more information.

Thanks


Re: Issue with delta import (not finding data in a column)

2010-05-12 Thread ahammad

Hello,

I was doing some more testing but I could not find a definitive reason for
this behavior. The following is my transformer:

public Map<String, Object> transformRow(Map<String, Object> row, Context context) {
    List<Map<String, String>> fields = context.getAllEntityFields();

    for (Map<String, String> field : fields) {
        // Check if this field has blob="true" specified in the data-config.xml
        String blob = field.get("blob");
        if ("true".equals(blob)) {
            String columnName = field.get("column");
            // Get the field's value from the current row
            Blob data = (Blob) row.get(columnName);
            // Transform the blob and store back into the same column
            if (data != null) {
                row.put(columnName, process(data));
            } else {
                log.error("Blob is null.");
            }
        }
    }

    return row;
}

Note: The "process" function is what actually takes care of the whole
transformation.

What I noticed is that the "row" variable only has the ID, probably due to
this:

deltaQuery="select ID from TABLE1 where (LASTMODIFIED >
to_date('${dataimporter.last_index_time}', '-mm-dd HH24:MI:SS'))"

However, even if I change it to a "select * " statement, I get everything
except the column that contains the blob (it is returned as null).

Something tells me that the data-config may be incorrect. I cannot explain
how this works for full-imports and not delta-imports.

I hope that I explained this issue properly. I am really stuck on this. Any
help would be highly appreciated.
--

ahammad wrote:
> 
> I have a Solr core that retrieves data from an Oracle DB. The DB table has
> a few columns, one of which is a Blob that represents a PDF document. In
> order to retrieve the actual content of the PDF file, I wrote a Blob
> transformer that converts the Blob into the PDF file, and subsequently
> reads it using PDFBox. The blob is contained in a DB column called
> DOCUMENT, and the data goes into a Solr field called fileContent, which is
> required.
> 
> This works fine when doing full imports, but it fails for delta imports. I
> debugged my transformer, and it appears that when it attempts to fetch the
> blob stored in the column, it gets nothing back (i.e. null). Because the
> data is essentially null, it cannot retrieve anything, and cannot store
> anything into Solr. As a result, the document does not get imported. I am
> not sure what the problem is, because this only occurs with delta imports.
> 
> Here is my data-config file:
> 
> 
> <dataConfig>
>   <dataSource driver="..." url="..." user="user" password="pass"/>
>   <document>
>     <entity name="..." query="..."
>             deltaImportQuery="select * from TABLE1 where ID ='${dataimporter.delta.ID}'"
>             deltaQuery="select ID from TABLE1 where (LASTMODIFIED > to_date('${dataimporter.last_index_time}', 'yyyy-mm-dd HH24:MI:SS'))"
>             transformer="BlobTransformer">
>       <field column="..." name="..." />
>       <field column="DOCUMENT" name="fileContent" blob="true"/>
>       <field column="LASTMODIFIED" name="lastModified" />
>     </entity>
>   </document>
> </dataConfig>
> 
> Thanks.
> 



Issue with delta import (not finding data in a column)

2010-05-10 Thread ahammad

I have a Solr core that retrieves data from an Oracle DB. The DB table has a
few columns, one of which is a Blob that represents a PDF document. In order
to retrieve the actual content of the PDF file, I wrote a Blob transformer
that converts the Blob into the PDF file, and subsequently reads it using
PDFBox. The blob is contained in a DB column called DOCUMENT, and the data
goes into a Solr field called fileContent, which is required.

This works fine when doing full imports, but it fails for delta imports. I
debugged my transformer, and it appears that when it attempts to fetch the
blob stored in the column, it gets nothing back (i.e. null). Because the
data is essentially null, it cannot retrieve anything, and cannot store
anything into Solr. As a result, the document does not get imported. I am
not sure what the problem is, because this only occurs with delta imports.

Here is my data-config file:

<dataConfig>
  <dataSource driver="..." url="..." user="user" password="pass"/>
  <document>
    <entity name="..." query="..."
            deltaImportQuery="select * from TABLE1 where ID ='${dataimporter.delta.ID}'"
            deltaQuery="select ID from TABLE1 where (LASTMODIFIED > to_date('${dataimporter.last_index_time}', 'yyyy-mm-dd HH24:MI:SS'))"
            transformer="BlobTransformer">
      <field column="..." name="..." />
      <field column="DOCUMENT" name="fileContent" blob="true"/>
      <field column="LASTMODIFIED" name="lastModified" />
    </entity>
  </document>
</dataConfig>
Thanks.


Adding a prefix to fields

2009-08-20 Thread ahammad

Hello,

Is it possible to add a prefix to the data in a Solr field? For example,
right now, I have a field called "id" that gets data from a DB through the
DataImportHandler. The DB returns a 4-character string like "ag5f". Would it
be possible to add a prefix to the data that is received?

In this specific case, the data relates to articles. So effectively, if the
DB has "ag5f" as an ID, I want it to be stored as "Article_ag5f". Is there a
way to define a prefix of "Article_" for a certain field?

I am aware that this can be done by writing a transformer. I already have 4
transformers handling a multitude of other things, and I would prefer an
alternative...
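
For example, would the DIH's TemplateTransformer be the kind of alternative
I am after? This is my untested reading of the wiki, assuming the entity is
named "article" and the DB column is selected as ID:

<entity name="article" query="..." transformer="TemplateTransformer">
  <field column="id" template="Article_${article.ID}"/>
</entity>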

Thanks



Re: Strange error with shards

2009-08-19 Thread ahammad

Each core has a different database as a datasource, which means that they
have different DB structures and fields. That is why the schemas are
different.

I figured out the cause of this problem. You were right, it was the
uniqueKey field. All of my cores have that field set to "id" but for this
new core, it is set to "threadID". Changing that to id fixed the problem.
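
In other words, the new core's schema.xml now matches the other cores:

<uniqueKey>id</uniqueKey>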




Shalin Shekhar Mangar wrote:
> 
> On Tue, Aug 18, 2009 at 9:01 PM, ahammad  wrote:
> 
>> HTTP Status 500 - null java.lang.NullPointerException at
>>
>> org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:437)
>> at
>>
>> The way I created this shard was to copy an existing one, erasing all the
>> data files/folders, and modifying my schema/data-config files. So the
>> core
>> settings are pretty much the same.
>>
> 
> What did you modify in the schema? All the shards should have the same
> schema. That exception can come if the uniqueKey is missing/null.
> 
If all the shards should have the same schema, then what is the point of
sharding in the first place? I thought that it was used to combine
different cores with different index structures... Right now, every core I
have is unique, and every schema is different...
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 




Strange error with shards

2009-08-18 Thread ahammad

Hello,

I have been using multicore/shards for the past 5 months or so with no
problems at all. I just added another core to my Solr server, but for some
reason I can never get the shards working when that specific core is
anywhere in the URL (either in the shards list or the base URL).

HTTP Status 500 - null java.lang.NullPointerException at
org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:437)
at
org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:281)
at
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:290)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330) at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:859)
at
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:574)
at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1527)
at java.lang.Thread.run(Thread.java:619) 

The way I created this shard was to copy an existing one, erasing all the
data files/folders, and modifying my schema/data-config files. So the core
settings are pretty much the same.

If I try the shard parameter with any of the other 7 cores that I have, it
works fine. It's only when this specific one is in the URL...

Cheers



Re: Question regarding merging Solr indexes

2009-08-09 Thread ahammad

Yes, that is exactly what I did.

If I copy that link, I get a 404 error saying that I need a core name in the
URL. If I add the core name in the URL, I get forwarded to the core's admin
panel, and nothing happens. Am I missing something else?


Shalin Shekhar Mangar wrote:
> 
> On Fri, Aug 7, 2009 at 10:45 PM, ahammad  wrote:
> 
>>
>> Hello,
>>
>> I have a MultiCore setup with 3 cores. I am trying to merge the indexes
>> of
>> core1 and core2 into core3. I looked at the wiki but I'm somewhat unclear
>> on
>> what needs to happen.
>>
>> This is what I used:
>>
>>
>> http://localhost:9085/solr/core3/admin/?action=mergeindexes&core=core3&indexDir=/solrHome/core1/data/index&indexDir=/solrHome/core2/data/index&commit=true
>>
>> When I hit this I just go to the admin page for core3. Maybe the way I
>> reference the indexes is incorrect? What path goes there anyway?
>>
> 
> Look at
> http://wiki.apache.org/solr/MergingSolrIndexes#head-0befd0949a54b6399ff926062279afec62deb9ce
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 




Question regarding merging Solr indexes

2009-08-07 Thread ahammad

Hello,

I have a MultiCore setup with 3 cores. I am trying to merge the indexes of
core1 and core2 into core3. I looked at the wiki but I'm somewhat unclear on
what needs to happen.

This is what I used:

http://localhost:9085/solr/core3/admin/?action=mergeindexes&core=core3&indexDir=/solrHome/core1/data/index&indexDir=/solrHome/core2/data/index&commit=true

When I hit this I just go to the admin page for core3. Maybe the way I
reference the indexes is incorrect? What path goes there anyway? 
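
Rereading the wiki, I wonder if the request is supposed to go to the
CoreAdmin handler rather than the core's admin page, something like the
following (my untested reading of it, with a separate commit on core3
afterwards):

http://localhost:9085/solr/admin/cores?action=mergeindexes&core=core3&indexDir=/solrHome/core1/data/index&indexDir=/solrHome/core2/data/index
http://localhost:9085/solr/core3/update?commit=true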

Thanks





Re: Problem with retrieving field from database using DIH

2009-07-31 Thread ahammad

I looked at the DIH debug page, but to be honest I'm not sure how to use it
well or get anything useful out of it.

I am using a solr 1.4 nightly from March.

Cheers



Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> you can try going to the DIH debug page. BTW which version of DIH are you
> using?
> 
> On Fri, Jul 31, 2009 at 6:31 PM, ahammad wrote:
>>
>> Hello,
>>
>> I tried it using the debug and verbose parameters in the address bar.
>> This
>> is what appears in the logs:
>>
>> INFO: Starting Full Import
>> Jul 31, 2009 8:54:40 AM org.apache.solr.handler.dataimport.SolrWriter
>> readIndexerProperties
>> INFO: Read dataimport.properties
>> Jul 31, 2009 8:54:40 AM org.apache.solr.handler.dataimport.DataImporter
>> doFullImport
>> SEVERE: Full Import failed
>> java.lang.NullPointerException
>>        at
>> org.apache.solr.handler.dataimport.DebugLogger.peekStack(DebugLogger.java:78)
>>        at
>> org.apache.solr.handler.dataimport.DebugLogger.log(DebugLogger.java:98)
>>        at
>> org.apache.solr.handler.dataimport.SolrWriter.log(SolrWriter.java:248)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:305)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:224)
>>        at
>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
>>        at
>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:316)
>>        at
>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:374)
>>        at
>> org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:187)
>>        at
>> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
>>        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330)
>>        at
>> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
>>        at
>> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
>>        at
>> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
>>        at
>> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
>>        at
>> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
>>        at
>> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
>>        at
>> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
>>        at
>> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
>>        at
>> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
>>        at
>> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
>>        at
>> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
>>        at
>> org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
>>        at
>> org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
>>        at
>> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
>>        at
>> org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
>>        at java.lang.Thread.run(Unknown Source)
>> Jul 31, 2009 8:54:40 AM org.apache.solr.update.DirectUpdateHandler2
>> rollback
>> INFO: start rollback
>> Jul 31, 2009 8:54:40 AM org.apache.solr.update.DirectUpdateHandler2
>> rollback
>> INFO: end_rollback
>>
>>
>> It's different than before because this fails right away. Before adding
>> debug/verbose, it would go through all the rows.
>>
>> It is possible that the last modified column may be missing some data in
>> some rows. The import, however, fails for every single row, which is
>> impossible. I am positive that there is data in that column.
>>
>> Any other suggestions?
>>
>> Cheers
>>
>>
>> ahammad wrote:
>>>
>>> Hello all,
>>>
>>> I've been having this issue for a while now. I am indexing a Sybase
>>> database. Everything is fantastic, except that there is 1 column that I
>>> can never get back. I don't have direct database access via Sybase
>>> client,
>>> but I was able to extract the data using some Java code.
>>>

Re: Problem with retrieving field from database using DIH

2009-07-31 Thread ahammad

Hello,

I tried it using the debug and verbose parameters in the address bar. This
is what appears in the logs:

INFO: Starting Full Import
Jul 31, 2009 8:54:40 AM org.apache.solr.handler.dataimport.SolrWriter
readIndexerProperties
INFO: Read dataimport.properties
Jul 31, 2009 8:54:40 AM org.apache.solr.handler.dataimport.DataImporter
doFullImport
SEVERE: Full Import failed
java.lang.NullPointerException
at
org.apache.solr.handler.dataimport.DebugLogger.peekStack(DebugLogger.java:78)
at 
org.apache.solr.handler.dataimport.DebugLogger.log(DebugLogger.java:98)
at 
org.apache.solr.handler.dataimport.SolrWriter.log(SolrWriter.java:248)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:305)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:224)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:316)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:374)
at
org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:187)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1330)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
at
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
at
org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
at
org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
at
org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
at
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
at java.lang.Thread.run(Unknown Source)
Jul 31, 2009 8:54:40 AM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
Jul 31, 2009 8:54:40 AM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback


It's different than before because this fails right away. Before adding
debug/verbose, it would go through all the rows.

It is possible that the last modified column may be missing some data in
some rows. The import, however, fails for every single row, which is
impossible. I am positive that there is data in that column.

Any other suggestions?

Cheers


ahammad wrote:
> 
> Hello all,
> 
> I've been having this issue for a while now. I am indexing a Sybase
> database. Everything is fantastic, except that there is 1 column that I
> can never get back. I don't have direct database access via Sybase client,
> but I was able to extract the data using some Java code.
> 
> The field is essentially a Last Modified field. In the DB I believe that
> it is of type long. In the Java program that I have, I am able to retrieve
> the data that is in that column and put it in a variable of type Long.
> This is not the case in Solr, however.
> 
> I set the variable in the schema as required to see why the data is never
> stored:
> <field name="lastModified" type="long" indexed="true" stored="true" required="true"/>
> 
> This is what I get in the Tomcat logs:
> 
> org.apache.solr.common.SolrException: Document [00069391] missing required
> field: lastModified
>   at
> org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:292)
>   at
> org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
>   at
> org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:67)
>   at
> org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:276)
>   at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:373)
>   at
> org.apache.solr.handler.dataimport.DocBuilder.doF

Problem with retrieving field from database using DIH

2009-07-30 Thread ahammad

Hello all,

I've been having this issue for a while now. I am indexing a Sybase
database. Everything is fantastic, except that there is 1 column that I can
never get back. I don't have direct database access via Sybase client, but I
was able to extract the data using some Java code.

The field is essentially a Last Modified field. In the DB I believe that it
is of type long. In the Java program that I have, I am able to retrieve the
data that is in that column and put it in a variable of type Long. This is
not the case in Solr, however.

I set the variable in the schema as required to see why the data is never
stored:

<field name="lastModified" type="long" indexed="true" stored="true" required="true"/>
This is what I get in the Tomcat logs:

org.apache.solr.common.SolrException: Document [00069391] missing required
field: lastModified
at
org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:292)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59)
at 
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:67)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:276)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:373)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:224)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:316)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:374)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:355)

From what I can gather, it is not finding the data and/or column, and thus
cannot populate the required field. However, the data is there, which I was
able to prove outside of Solr.

Is there a way to generate more descriptive logs for this? I am completely
lost. I hit this problem a few months ago but I was never able to resolve
it. Any help on this will be much appreciated.

BTW, Solr was successful in retrieving data from other columns in the same
table...

Thanks



Re: Question about formatting the results returned from Solr

2009-07-30 Thread ahammad

Yes, I get that.

The problem arises when you have multiple authors. How can I know which
first name goes with which user id etc...

Cheers


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> apparently all the data is going to one field 'author'
> 
> instead they should be sent to separate fields
> author_fname
> author_lname
> author_email
> 
> so you would get details like
> 
> <author_fname>John</author_fname>
> <author_lname>Doe</author_lname>
> <author_email>j...@doe.com</author_email>
> 
> 
> 
> On Wed, Jul 29, 2009 at 7:39 PM, ahammad wrote:
>>
>> Hi all,
>>
>> Not sure how good my title is, but here is a (hopefully) better
>> explanation
>> on what I mean.
>>
>> I am indexing a set of articles from a DB. Each article has an author.
>> The
>> author is saved in then the DB as an author ID, which is a number.
>>
>> There is another table in the DB with more relevant information about the
>> author. Basically it has columns like:
>>
>> id, firstname, lastname, email, userid
>>
>> I set up the DIH so that it returns the userid, and it works fine:
>>
>> <arr name="author">
>>   <str>jdoe</str>
>>   <str>msmith</str>
>> </arr>
>>
>> Would it be possible to return all of the information about the author
>> (first name, ...) as a subset of the results above?
>>
>> Here is what I mean:
>>
>> <author>
>>   <firstname>John</firstname>
>>   <lastname>Doe</lastname>
>>   <email>j...@doe.com</email>
>> </author>
>> ...
>>
>> Something similar to that at least...
>>
>> Not sure how descriptive I was, but any pointers would be highly
>> appreciated.
>>
>> Cheers
>>
>>
>>
> 
> 
> 
> -- 
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 
> 




Question about formatting the results returned from Solr

2009-07-29 Thread ahammad

Hi all,

Not sure how good my title is, but here is a (hopefully) better explanation
on what I mean.

I am indexing a set of articles from a DB. Each article has an author. The
author is saved in then the DB as an author ID, which is a number.

There is another table in the DB with more relevant information about the
author. Basically it has columns like:

id, firstname, lastname, email, userid

I set up the DIH so that it returns the userid, and it works fine:


<arr name="author">
  <str>jdoe</str>
  <str>msmith</str>
</arr>


Would it be possible to return all of the information about the author
(first name, ...) as a subset of the results above?

Here is what I mean:


   
<author>
  <firstname>John</firstname>
  <lastname>Doe</lastname>
  <email>j...@doe.com</email>
</author>
...


Something similar to that at least...
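
On the DIH side, I imagine this would mean a child entity feeding a few
parallel author fields, something like the sketch below (table and column
names are invented, and the author_* fields would be multiValued so that
several authors stay aligned by position):

<entity name="article" query="select id, title, authorid from articles">
  <entity name="author"
          query="select firstname, lastname, email from authors where userid = '${article.authorid}'">
    <field column="firstname" name="author_firstname"/>
    <field column="lastname" name="author_lastname"/>
    <field column="email" name="author_email"/>
  </entity>
</entity>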

Not sure how descriptive I was, but any pointers would be highly
appreciated.

Cheers




Re: Solr MultiCore query

2009-07-17 Thread ahammad

Hello joe_coder,

Are you using the default example docs in your queries?

If so, then I see that the word "ipod" appears in a field called "name". By
default, the default search field (defined by <defaultSearchField> in
schema.xml) is the field called "text". This means that when you submit a
query without specifying which field to look in (using the field:query
notation), Solr automatically assumes that you are looking in the field
called "text".

If you change your query to q=name:ipod, you should get the results back.

One way to prevent this is to change your default search field to something
else. Alternatively, if you want to search on multiple fields, you can copy
all those fields to the "text" field and go from there. This can be useful
if for example you had a book library to search through. You may need to
search on title, short summary, description etc simultaneously. You can copy
all those things to the text field and then search on the text field, which
contains all the information that you wanted to search on.
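
The example schema already does something like this for a few fields via
copyField, e.g. (from memory, so double-check your schema.xml):

<copyField source="name" dest="text"/>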


joe_coder wrote:
> 
> Thanks ahammad for the quick reply.
> 
> As suggested, I am trying out multi core way of implementing the search. I
> am trying out the multicore example and getting stuck at an issue. Here is
> what I did and the issue I am facing
> 
> 1) Downloaded 1.4 and started the multicore example using java
> -Dsolr.solr.home=multicore -jar start.jar
> 
> 2) There were 2 files present under example/multicore/exampledocs/ , which
> I
> added to 2 cores respectively. ( Totally 3 docs are present in those 2
> files
> and all have the word 'ipod' in it )
> 
> 3) When I query using
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=*:*
> I get all the 3 results.
> 
> But when I query using
> http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=ipod
> I get no results :(
> 
> What could be the issue ?
> 
> Thanks!
> 
> 
> On Fri, Jul 17, 2009 at 7:20 PM, ahammad  wrote:
> 
>>
>> Hello,
>>
>> I'm not sure what the best way is to do this, but I have done something
>> identical.
>>
>> I have the same requirements, ie several datasources. I also used SolrJ
>> and
>> jsp for this. The way I ended up doing it was to create a multi core
>> environment, one core per datasource. When I do a query across several
>> datasources, I use shards. Solr automatically returns a "hybrid" result
>> set
>> that way, sorted by solr's default scoring.
>>
>> Faceting comes in the picture when you want to show the number of
>> documents
>> per datasource and have the ability to narrow down the result set. The
>> way
>> I
>> did it was to add a field called "dataSource" to all the documents, and
>> injected them with a default value of the data source name (in your case,
>> D1, D2 ...). You can do this by adding this in the schema:
>>
>> <field name="dataSource" type="string" indexed="true" stored="true"
>> required="true" default="D1"/>
>>
>> When you perform a query across multiple datasources, you will use
>> shards.
>> Here is an example:
>>
>>
>> http://localhost:8080/solr/core1/select?shards=localhost:8080/solr/core1,localhost:8080/solr/core2&q=some
>> query
>>
>> That will search on both cores 1 and 2.
>>
>> To facet on the datasource in order to be able to categorize the result
>> set,
>> you can simply add this snippet to the query:
>>
>> &facet=on&facet.field=dataSource
>>
>> This will return the datasources that are defined with their number of
>> results for the query.
>>
>> Making the facet results clickable in order to narrow down the results
>> can
>> be achieved by adding a filter to the query and filtering to a specific
>> dataSource. I actually ended up creating a fairly intuitive front-end for
>> my
>> system with faceting, filtering, paging etc all using jsp and SolrJ.
>> SolrJ
>> is powerful enough to handle all of the backend processing.
>>
>> Good luck!
>>
>>
>>
>>
>>
>>
>> joe_coder wrote:
>> >
>> > I missed adding some size related information in the query above.
>> >
>> > D1 and D2 would have close to 1 million records each
>> > D3 would have ~10 million records.
>> >
>> > Thanks!
>> >
>>
>>
>>
> 
> 




Re: Solr MultiCore query

2009-07-17 Thread ahammad

Hello,

I'm not sure what the best way is to do this, but I have done something
identical.

I have the same requirements, ie several datasources. I also used SolrJ and
jsp for this. The way I ended up doing it was to create a multi core
environment, one core per datasource. When I do a query across several
datasources, I use shards. Solr automatically returns a "hybrid" result set
that way, sorted by solr's default scoring.

Faceting comes in the picture when you want to show the number of documents
per datasource and have the ability to narrow down the result set. The way I
did it was to add a field called "dataSource" to all the documents, and
injected them with a default value of the data source name (in your case,
D1, D2 ...). You can do this by adding this in the schema:

<field name="dataSource" type="string" indexed="true" stored="true"
required="true" default="D1"/>

When you perform a query across multiple datasources, you will use shards.
Here is an example:

http://localhost:8080/solr/core1/select?shards=localhost:8080/solr/core1,localhost:8080/solr/core2&q=some
query

That will search on both cores 1 and 2.

To facet on the datasource in order to be able to categorize the result set,
you can simply add this snippet to the query:

&facet=on&facet.field=dataSource

This will return the datasources that are defined with their number of
results for the query.

Making the facet results clickable in order to narrow down the results can
be achieved by adding a filter to the query and filtering to a specific
dataSource. I actually ended up creating a fairly intuitive front-end for my
system with faceting, filtering, paging etc all using jsp and SolrJ. SolrJ
is powerful enough to handle all of the backend processing.
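
In SolrJ terms the whole thing stays fairly small. Roughly like this (host
and core names are assumed, and this is from memory rather than copied from
my code):

SolrServer server = new CommonsHttpSolrServer("http://localhost:8080/solr/core1");
SolrQuery query = new SolrQuery("some query");
query.setParam("shards", "localhost:8080/solr/core1,localhost:8080/solr/core2");
query.setFacet(true);
query.addFacetField("dataSource");      // facet counts per datasource
query.addFilterQuery("dataSource:D1");  // added when the user clicks a facet
QueryResponse rsp = server.query(query);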

Good luck!






joe_coder wrote:
> 
> I missed adding some size related information in the query above.
> 
> D1 and D2 would have close to 1 million records each
> D3 would have ~10 million records.
> 
> Thanks!
> 




Indexing rich documents from websites using ExtractingRequestHandler

2009-07-08 Thread ahammad

Hello,

I can index rich documents like pdf for instance that are on the filesystem.
Can we use ExtractingRequestHandler to index files that are accessible on a
website?

For example, there is a file that can be reached like so:
http://www.sub.myDomain.com/files/pdfdocs/testfile.pdf

How would I go about indexing that file? I tried using the following
combinations. I will put the errors in brackets:

stream.file=http://www.sub.myDomain.com/files/pdfdocs/testfile.pdf (The
filename, directory name, or volume label syntax is incorrect)
stream.file=www.sub.myDomain.com/files/pdfdocs/testfile.pdf (The system
cannot find the path specified)
stream.file=//www.sub.myDomain.com/files/pdfdocs/testfile.pdf (The format of
the specified network name is invalid)
stream.file=sub.myDomain.com/files/pdfdocs/testfile.pdf (The system cannot
find the path specified)
stream.file=//sub.myDomain.com/files/pdfdocs/testfile.pdf (The network path
was not found)

I sort of understand why I get those errors. What are the alternative
methods of doing this? I am guessing that the stream.file attribute doesn't
support web addresses. Is there another attribute that does?
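
One more thing I want to try: the content stream documentation mentions a
stream.url parameter that takes a full URL, provided remote streaming is
enabled in solrconfig.xml. I have not verified this yet:

<requestParsers enableRemoteStreaming="true" multipartUploadLimitInKB="2048" />

and then stream.url=http://www.sub.myDomain.com/files/pdfdocs/testfile.pdf
in the request.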



Question regarding ExtractingRequestHandler

2009-07-07 Thread ahammad

Hello,

I've recently started using this handler to index MS Word and PDF files.
When I set ext.extract.only=true, I get back all the metadata that is
associated with that file.

If I want to index, I need to set ext.extract.only=false. If I want to index
all that metadata along with the contents, what inputs do I need to pass to
the http request? Do I have to specifically define all the fields in the
schema or can Solr dynamically generate those fields?

Thanks.



Re: Installing a patch in a solr nightly on Windows

2009-07-02 Thread ahammad

When I go to the source and I input the command, I get:

bash: patch: command not found

Thanks


Koji Sekiguchi-2 wrote:
> 
> ahammad wrote:
>> Thanks for the suggestions:
>>
>> Koji: I am aware of Cygwin. The problem is I am not sure how to do the
>> whole
>> thing. I downloaded a nightly zip file and extracted it to a directory.
>> Where do I put the .patch file? Where do I execute the "patch..." command
>> from? It doesn't work when I do it at the root of the install.
>>
>>   
> It should work at the root of the install:
> 
> $ patch -p0 < SOLR-284.patch
> 
> Do you see an error message? What's error?
> 
> Koji
> 
> 
> 
> 




Re: Installing a patch in a solr nightly on Windows

2009-07-02 Thread ahammad

Thanks for the suggestions:

Koji: I am aware of Cygwin. The problem is I am not sure how to do the whole
thing. I downloaded a nightly zip file and extracted it to a directory.
Where do I put the .patch file? Where do I execute the "patch..." command
from? It doesn't work when I do it at the root of the install.

Michael: I'll take a look at that standalone utility.

Paul: I assume that in order to do it with svn, you need to checkout the
trunk? What do you do after that? Do you have the link to the distributions?
I get "OPTIONS of 'http://svn.apache.org/repos/asf/lucene/solr/trunk': could
not connect to server (http://svn.apache.org)" when I try. Something tells
me that my proxy is blocking the connection. If that is the case, then I
don't think that I can do a checkout. Do you have any other alternatives?
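
For reference, this is the sequence I was attempting, in case the commands
themselves are off (assuming SOLR-284.patch sits in the checkout root):

svn checkout http://svn.apache.org/repos/asf/lucene/solr/trunk solr-trunk
cd solr-trunk
patch -p0 < SOLR-284.patch
ant dist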

Thanks again for the input.


ahammad wrote:
> 
> Hello,
> 
> I am trying to install a patch for Solr
> (https://issues.apache.org/jira/browse/SOLR-284) but I'm not sure how to
> do it in Windows.
> 
> I have a copy of the nightly build, but I don't know how to proceed. I
> looked at the HowToContribute wiki for patch installation instructions,
> but there are no Windows specific instructions in there.
> 
> Any help would be greatly appreciated.
> 
> Thanks
> 




Installing a patch in a solr nightly on Windows

2009-06-30 Thread ahammad

Hello,

I am trying to install a patch for Solr
(https://issues.apache.org/jira/browse/SOLR-284) but I'm not sure how to do
it in Windows.

I have a copy of the nightly build, but I don't know how to proceed. I
looked at the HowToContribute wiki for patch installation instructions, but
there are no Windows specific instructions in there.

Any help would be greatly appreciated.

Thanks



Re: Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

Sorry for the additional message, the disclaimer was missing.

Disclaimer: The code that was used was taken from the following site:
http://e-mats.org/2008/04/using-solrj-a-short-guide-to-getting-started-with-solrj/
. 


ahammad wrote:
> 
> Hello,
> 
> I played around some more with it and I found out that I was pointing my
> constructor to an older class that doesn't have the MultiCore capability.
> 
> This is what I did to set up the shards:
> 
> query.setParam("shards",
> "localhost:8080/solr/core0/,localhost:8080/solr/core1/");
> 
> I do have a new issue with this though. Here is how the results are
> displayed:
> 
> QueryResponse qr = server.query(query);
> 
> SolrDocumentList sdl = qr.getResults();
> 
> System.out.println("Found: " + sdl.getNumFound());
> System.out.println("Start: " + sdl.getStart());
> System.out.println("Max Score: " + sdl.getMaxScore());
> System.out.println("");
> 
> ArrayList<HashMap<String, Object>> hitsOnPage = new ArrayList<HashMap<String, Object>>();
> 
> for (SolrDocument d : sdl)
> {
>     HashMap<String, Object> values = new HashMap<String, Object>();
> 
>     for (Iterator<Map.Entry<String, Object>> i = d.iterator(); i.hasNext(); )
>     {
>         Map.Entry<String, Object> e2 = i.next();
>         values.put(e2.getKey(), e2.getValue());
>     }
> 
>     hitsOnPage.add(values);
> 
>     String outputString = (String) values.get("title");
>     System.out.println(outputString);
> }
> 
> The field "title" is one of the common fields that is shared between the
> two schemas. When I print the results of my query, I get null for
> everything. However, the result of sdl.getNumFound() is correct, so I know
> that both cores are being accessed.
> 
> Is there a difference with how SolrJ handles multicore requests?
> 
> Disclaimer: The code 
> 
> 
> 
> ahammad wrote:
>> 
>> Hello,
>> 
>> I have a MultiCore install of solr with 2 cores with different schemas
>> and such. Querying directly using http request and/or the solr interface
>> works very well for my purposes.
>> 
>> I want to have a proper search interface though, so I have some code that
>> basically acts as a link between the server and the front-end. Basically,
>> depending on the options, the search string is built, and when the search
>> is submitted, that string gets passed as an http request. The code then
>> would parse through the xml to get the information.
>> 
>> This method works with shards because I can add the shards parameter
>> straight into the link that I end up hitting. Although this is currently
>> functional, I was thinking of using SolrJ simply because it is simpler to
>> use and would cut down the amount of code.
>> 
>> The question is, how would I be able to define the shards in my query, so
>> that when I do search, I hit both shards and get mixed results back?
>> Using http requests, it's as simple as adding a shard=core0,core1
>> snippet. What is the equivalent of this in SolrJ?
>> 
>> BTW, I do have some SolrJ code that is able to query and return results,
>> but for a single core. I am currently using CommonsHttpSolrServer for
>> that, not the Embedded one.
>> 
>> Cheers
>> 
> 
> 




Re: Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

Hello,

I played around some more with it and I found out that I was pointing my
constructor to an older class that doesn't have the MultiCore capability.

This is what I did to set up the shards:

query.setParam("shards",
"localhost:8080/solr/core0/,localhost:8080/solr/core1/");

I do have a new issue with this though. Here is how the results are
displayed:

QueryResponse qr = server.query(query);

SolrDocumentList sdl = qr.getResults();

System.out.println("Found: " + sdl.getNumFound());
System.out.println("Start: " + sdl.getStart());
System.out.println("Max Score: " + sdl.getMaxScore());
System.out.println("");

ArrayList<HashMap<String, Object>> hitsOnPage = new ArrayList<HashMap<String, Object>>();

for (SolrDocument d : sdl)
{
    HashMap<String, Object> values = new HashMap<String, Object>();

    for (Iterator<Map.Entry<String, Object>> i = d.iterator(); i.hasNext(); )
    {
        Map.Entry<String, Object> e2 = i.next();
        values.put(e2.getKey(), e2.getValue());
    }

    hitsOnPage.add(values);

    String outputString = (String) values.get("title");
    System.out.println(outputString);
}

The field "title" is one of the common fields that is shared between the two
schemas. When I print the results of my query, I get null for everything.
However, the result of sdl.getNumFound() is correct, so I know that both
cores are being accessed.

Is there a difference with how SolrJ handles multicore requests?

Disclaimer: The code 



ahammad wrote:
> 
> Hello,
> 
> I have a MultiCore install of solr with 2 cores with different schemas and
> such. Querying directly using http request and/or the solr interface works
> very well for my purposes.
> 
> I want to have a proper search interface though, so I have some code that
> basically acts as a link between the server and the front-end. Basically,
> depending on the options, the search string is built, and when the search
> is submitted, that string gets passed as an http request. The code then
> would parse through the xml to get the information.
> 
> This method works with shards because I can add the shards parameter
> straight into the link that I end up hitting. Although this is currently
> functional, I was thinking of using SolrJ simply because it is simpler to
> use and would cut down the amount of code.
> 
> The question is, how would I be able to define the shards in my query, so
> that when I do search, I hit both shards and get mixed results back? Using
> http requests, it's as simple as adding a shard=core0,core1 snippet. What
> is the equivalent of this in SolrJ?
> 
> BTW, I do have some SolrJ code that is able to query and return results,
> but for a single core. I am currently using CommonsHttpSolrServer for
> that, not the Embedded one.
> 
> Cheers
> 

-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23838351.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

I'm still not sure what you meant. I took a look at that class but I haven't
got any idea on how to proceed.

BTW I tried something like this 

query.setParam("shard", "http://localhost:8080/solr/core0/",
"http://localhost:8080/solr/core1/");

But it doesn't seem to work for me. I tried it with different variations
too, like removing the http://, and combining both cores as a single string.

Could you please clarify your suggestion?

Regards


Otis Gospodnetic wrote:
> 
> 
> You should be able to set any name=value URL parameter pair and send it to
> Solr using SolrJ.  What's the name of that class... MapSolrParams, I
> believe.
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: ahammad 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, June 2, 2009 11:06:55 AM
>> Subject: Using SolrJ with multicore/shards
>> 
>> 
>> Hello,
>> 
>> I have a MultiCore install of solr with 2 cores with different schemas
>> and
>> such. Querying directly using http request and/or the solr interface
>> works
>> very well for my purposes.
>> 
>> I want to have a proper search interface though, so I have some code that
>> basically acts as a link between the server and the front-end. Basically,
>> depending on the options, the search string is built, and when the search
>> is
>> submitted, that string gets passed as an http request. The code then
>> would
>> parse through the xml to get the information.
>> 
>> This method works with shards because I can add the shards parameter
>> straight into the link that I end up hitting. Although this is currently
>> functional, I was thinking of using SolrJ simply because it is simpler to
>> use and would cut down the amount of code.
>> 
>> The question is, how would I be able to define the shards in my query, so
>> that when I do search, I hit both shards and get mixed results back?
>> Using
>> http requests, it's as simple as adding a shard=core0,core1 snippet. What
>> is
>> the equivalent of this in SolrJ?
>> 
>> BTW, I do have some SolrJ code that is able to query and return results,
>> but
>> for a single core. I am currently using CommonsHttpSolrServer for that,
>> not
>> the Embedded one.
>> 
>> Cheers
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23834518.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23836485.html
Sent from the Solr - User mailing list archive at Nabble.com.



Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

Hello,

I have a MultiCore install of solr with 2 cores with different schemas and
such. Querying directly using http request and/or the solr interface works
very well for my purposes.

I want to have a proper search interface though, so I have some code that
basically acts as a link between the server and the front-end. Basically,
depending on the options, the search string is built, and when the search is
submitted, that string gets passed as an http request. The code then would
parse through the xml to get the information.

This method works with shards because I can add the shards parameter
straight into the link that I end up hitting. Although this is currently
functional, I was thinking of using SolrJ simply because it is simpler to
use and would cut down the amount of code.

The question is, how would I be able to define the shards in my query, so
that when I do search, I hit both shards and get mixed results back? Using
http requests, it's as simple as adding a shard=core0,core1 snippet. What is
the equivalent of this in SolrJ?

BTW, I do have some SolrJ code that is able to query and return results, but
for a single core. I am currently using CommonsHttpSolrServer for that, not
the Embedded one.

Cheers
-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23834518.html
Sent from the Solr - User mailing list archive at Nabble.com.



Question about field types and querying

2009-05-28 Thread ahammad

Hello,

I have a field type of "text" in my collection called "question".

When I query for the word "customer" for example in the "question" field (ie
q=question:customer), the first document with the highest score shows up,
but does not contain the word customer at all.

Instead, it contains the word "customize".

What would be a way around this? I tried changing the type to string instead
of text, but then I wouldn't get any results unless I have the exact
statement in there...
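
The likely culprit is stemming: the example "text" type runs a Porter
stemmer, which reduces both "customer" and "customize" to the same root,
so they match each other. One possible workaround is a sketch like the
following in schema.xml (the type and field names here are made up for
illustration): keep the stemmed field for recall, and copyField into an
unstemmed field that can be queried for exact tokens.

<fieldType name="text_unstemmed" class="solr.TextField"
           positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<field name="question_exact" type="text_unstemmed" indexed="true"
       stored="false"/>
<copyField source="question" dest="question_exact"/>

A query like question_exact:customer would then only match documents that
actually contain the token "customer".
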
-- 
View this message in context: 
http://www.nabble.com/Question-about-field-types-and-querying-tp23768061p23768061.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Problems getting up and running.

2009-05-28 Thread ahammad

Hello,

In the solrconfig.xml file, there is a property:

<dataDir>${solr.data.dir:./solr/data}</dataDir>

Try setting something else in here and see what happens...I'm not sure how
solr works with Ubuntu, but it's worth a shot...


Tim Haughton wrote:
> 
> OK, I spoke too soon.
> 
> When you tried it on your Mac, did it create the index in the right place?
> Mine is still trying to create it under the webapps directory.
> 
> Cheers,
> 
> Tim
> 
> 2009/5/28 Tim Haughton 
> 
>> 2009/5/28 Koji Sekiguchi 
>>
>>>
>>> Ok.
>>> I've just tried it (the way you quoted above) on my Mac and worked
>>> fine...
>>> Do you see any errors on Tomcat log when starting?
>>>
>>
>> Sussed it. As you would imagine it was the stupidest of things. And
>> probably the *one* thing left out of my description. My solr.xml file had
>> an
>> illegal character at the top of the file. I hadn't noticed it until the
>> error logs pushed me in the right direction. Thanks for the pointer.
>>
>> Cheers,
>>
>> Tim
>>
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Problems-getting-up-and-running.-tp23758840p23761217.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing from DB connection issue

2009-05-27 Thread ahammad

Hello,

I tried your suggestion, and it still gives me the same error.

I'd like to point out again that the same folder/config setup is running on
my machine with no issues, but it gives me that stack trace in the logs on
the server.

When I do the full data import request through the browser, I get this:


<response>
  <lst name="responseHeader">
    <int name="status">0</int>
    <int name="QTime">0</int>
  </lst>
  <lst name="initArgs">
    <lst name="defaults">
      <str name="config">data-config.xml</str>
    </lst>
  </lst>
  <str name="command">full-import</str>
  <str name="status">idle</str>
  <str name="importResponse"/>
  <lst name="statusMessages">
    <str name="Time Elapsed">0:0:1.329</str>
    <str name="Total Requests made to DataSource">1</str>
    <str name="Total Rows Fetched">0</str>
    <str name="Total Documents Processed">0</str>
    <str name="Total Documents Skipped">0</str>
    <str name="Full Dump Started">2009-05-27 09:42:24</str>
  </lst>
  <str name="WARNING">This response format is experimental.  It is likely to change in the future.</str>
</response>


Refreshing the page usually causes the requests-to-datasource and
rows-fetched numbers to increase. In my case the requests to the
datasource stay at 1 regardless. It looks like it tries once, fails, and
then terminates the process...

Regards


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> no need to rename .
> 
> On Wed, May 27, 2009 at 6:50 PM, ahammad  wrote:
>>
>> Would I need to rename it or refer to it somewhere? Or can I keep the
>> existing name (apache-solr-dataimporthandler-1.4-dev.jar)?
>>
>> Cheers
>>
>>
>> Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
>>>
>>> take the trunk dih.jar. use  winzip/winrar or any tool and just delete
>>> all the files other than ClobTransformer.class. put that jar into
>>> solr.home/lib
>>>
>>> On Wed, May 27, 2009 at 6:10 PM, ahammad  wrote:
>>>>
>>>> Hmmm, that's probably a good idea...although it does not explain how my
>>>> current local setup works.
>>>>
>>>> Can you please explain how this is done? I am assuming that I need to
>>>> add
>>>> the class itself to the source of solr 1.3, and then compile the code,
>>>> and
>>>> take the new .war file and put it in Tomcat? If that is correct, where
>>>> in
>>>> the source folders would the ClobTransformer.class file go?
>>>>
>>>> Thanks.
>>>>
>>>>
>>>>
>>>> Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
>>>>>
>>>>> I guess it is better to copy the ClobTransformer.class  alone and use
>>>>> the old Solr1.3 DIH
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Tue, May 26, 2009 at 11:50 PM, ahammad 
>>>>> wrote:
>>>>>>
>>>>>> I have an update:
>>>>>>
>>>>>> I played around with it some more and it seems like it's being caused
>>>>>> by
>>>>>> the
>>>>>> ClobTransformer. If I remove the 'clob="true"' from the field part in
>>>>>> the
>>>>>> data-config, it works fine.
>>>>>>
>>>>>> The Solr install is a multicore one. I placed the
>>>>>> apache-solr-dataimporthandler-1.4-dev.jar from the nightly builds in
>>>>>> the
>>>>>> {solrHome}/core1/lib directory (I only need it for the first core).
>>>>>> Is
>>>>>> there
>>>>>> something else I need to do for it to work?
>>>>>>
>>>>>> I don't recall doing an additional step when I did this a few weeks
>>>>>> ago
>>>>>> on
>>>>>> my local machine.
>>>>>>
>>>>>> Any help is appreciated.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>>
>>>>>> ahammad wrote:
>>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I am tyring to index directly from an Oracle DB. This is what
>>>>>>> appears
>>>>>>> in
>>>>>>> the stack trace:
>>>>>>>
>>>>>>> SEVERE: Full Import failed
>>>>>>> org.apache.solr.handler.dataimport.DataImportHandlerException:
>>>>>>> Unable
>>>>>>> to
>>>>>>> execute query: select * from ARTICLE Processing Document # 1
>>>>>>>       at
>>>>>>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:186)
>>>>>>>       at
>>>>>>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:143)
>>>>>>>       at
>>>>>>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:43)
>>>>>>>       at
>>>>>>> org.apache.solr.handler.dataimport.Sq

Re: Indexing from DB connection issue

2009-05-27 Thread ahammad

Would I need to rename it or refer to it somewhere? Or can I keep the
existing name (apache-solr-dataimporthandler-1.4-dev.jar)?

Cheers


Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> take the trunk dih.jar. use  winzip/winrar or any tool and just delete
> all the files other than ClobTransformer.class. put that jar into
> solr.home/lib
> 
> On Wed, May 27, 2009 at 6:10 PM, ahammad  wrote:
>>
>> Hmmm, that's probably a good idea...although it does not explain how my
>> current local setup works.
>>
>> Can you please explain how this is done? I am assuming that I need to add
>> the class itself to the source of solr 1.3, and then compile the code,
>> and
>> take the new .war file and put it in Tomcat? If that is correct, where in
>> the source folders would the ClobTransformer.class file go?
>>
>> Thanks.
>>
>>
>>
>> Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
>>>
>>> I guess it is better to copy the ClobTransformer.class  alone and use
>>> the old Solr1.3 DIH
>>>
>>>
>>>
>>>
>>>
>>> On Tue, May 26, 2009 at 11:50 PM, ahammad 
>>> wrote:
>>>>
>>>> I have an update:
>>>>
>>>> I played around with it some more and it seems like it's being caused
>>>> by
>>>> the
>>>> ClobTransformer. If I remove the 'clob="true"' from the field part in
>>>> the
>>>> data-config, it works fine.
>>>>
>>>> The Solr install is a multicore one. I placed the
>>>> apache-solr-dataimporthandler-1.4-dev.jar from the nightly builds in
>>>> the
>>>> {solrHome}/core1/lib directory (I only need it for the first core). Is
>>>> there
>>>> something else I need to do for it to work?
>>>>
>>>> I don't recall doing an additional step when I did this a few weeks ago
>>>> on
>>>> my local machine.
>>>>
>>>> Any help is appreciated.
>>>>
>>>> Regards
>>>>
>>>>
>>>> ahammad wrote:
>>>>>
>>>>> Hello all,
>>>>>
>>>>> I am tyring to index directly from an Oracle DB. This is what appears
>>>>> in
>>>>> the stack trace:
>>>>>
>>>>> SEVERE: Full Import failed
>>>>> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable
>>>>> to
>>>>> execute query: select * from ARTICLE Processing Document # 1
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:186)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:143)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:43)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>>>> Caused by: java.sql.SQLException: Closed Connection
>>>>>       at
>>>>> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
>>>>>       at
>>>>> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
>>>>>       at
>>>>> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:208)
>>>>>       at
>>>>> oracle.jdbc.driver.PhysicalConnection.createStatement(PhysicalConnection.java:755)
>>>>>       at
>>>>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDat

Re: Indexing from DB connection issue

2009-05-27 Thread ahammad

Hmmm, that's probably a good idea...although it does not explain how my
current local setup works.

Can you please explain how this is done? I am assuming that I need to add
the class itself to the source of solr 1.3, and then compile the code, and
take the new .war file and put it in Tomcat? If that is correct, where in
the source folders would the ClobTransformer.class file go?

Thanks.
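
For the record, Noble's suggestion amounts to something like this with
the standard jar tool (the input jar name is from this thread; the output
name is my own):

jar xf apache-solr-dataimporthandler-1.4-dev.jar \
    org/apache/solr/handler/dataimport/ClobTransformer.class
jar cf clob-transformer.jar org

The resulting clob-transformer.jar then goes into solr.home/lib alongside
the stock 1.3 DIH.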



Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> I guess it is better to copy the ClobTransformer.class  alone and use
> the old Solr1.3 DIH
> 
> 
> 
> 
> 
> On Tue, May 26, 2009 at 11:50 PM, ahammad  wrote:
>>
>> I have an update:
>>
>> I played around with it some more and it seems like it's being caused by
>> the
>> ClobTransformer. If I remove the 'clob="true"' from the field part in the
>> data-config, it works fine.
>>
>> The Solr install is a multicore one. I placed the
>> apache-solr-dataimporthandler-1.4-dev.jar from the nightly builds in the
>> {solrHome}/core1/lib directory (I only need it for the first core). Is
>> there
>> something else I need to do for it to work?
>>
>> I don't recall doing an additional step when I did this a few weeks ago
>> on
>> my local machine.
>>
>> Any help is appreciated.
>>
>> Regards
>>
>>
>> ahammad wrote:
>>>
>>> Hello all,
>>>
>>> I am tyring to index directly from an Oracle DB. This is what appears in
>>> the stack trace:
>>>
>>> SEVERE: Full Import failed
>>> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
>>> execute query: select * from ARTICLE Processing Document # 1
>>>       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:186)
>>>       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:143)
>>>       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:43)
>>>       at
>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>>>       at
>>> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
>>>       at
>>> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>>       at
>>> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>>       at
>>> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>>       at
>>> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>>       at
>>> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>>       at
>>> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>>> Caused by: java.sql.SQLException: Closed Connection
>>>       at
>>> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
>>>       at
>>> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
>>>       at
>>> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:208)
>>>       at
>>> oracle.jdbc.driver.PhysicalConnection.createStatement(PhysicalConnection.java:755)
>>>       at
>>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:174)
>>>       ... 10 more
>>>
>>> Funny thing is, the data import works on my local machine. I moved all
>>> the
>>> config files to another server, and I get this. I reindexed on my local
>>> machine immediately after in order to verify that the DB works, and it
>>> indexes fine.
>>>
>>> Here is my data-config file, just in case:
>>>
>>> <dataConfig>
>>>     <dataSource driver="..." url="..." user="xxx" password="xxx"/>
>>>     <document>
>>>         <entity name="ARTICLE" query="select * from ARTICLE"
>>>                 transformer="ClobTransformer">
>>>             <field column="..." name="..." />
>>>             <field column="..." name="..." clob="true" />
>>>             <field column="..." name="..." />
>>>             <entity name="..."
>>>                     query="select ID_A from ARTICLE_AUTHOR where ID_A='${ARTICLE.ID}'">
>>>                 <field column="ID_A" name="author" />
>>>             </entity>
>>>         </entity>
>>>     </document>
>>> </dataConfig>
>>>
>>> I am using the 1.3 release version, with the 1.4 DIH jar file for the
>>> Clob
>>> Transformer. What could be causing this?
>>>
>>> Cheers
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Indexing-from-DB-connection-issue-tp23725712p23728596.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-from-DB-connection-issue-tp23725712p23741712.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multicore Solr not returning expects results from search

2009-05-26 Thread ahammad

I too am using 1.3. The way you specified shards is correct.

For instance, I normally make the request to core0, and in the shards
parameter, I put the addresses of both core0 and core1. I am using Tomcat
though, so that may be different...

Is there anything in the logs that strikes you as odd when you query across
multiple shards?


KennyN wrote:
> 
> Thanks for the reply ahammad, that helps. Are you specifying them both in
> a URL, or in the <str name="shards">localhost:8983/solr/core0,localhost:8983/solr/core1</str>
> like I have? 
> 
> I should add that I now have two indices that have different data in them.
> That is to say the ids are unique across both shards and I am still seeing
> this issue...
> 
> I should also note that this is Solr 1.3, I don't think I mentioned that
> before.
> 
> 
> ahammad wrote:
>> 
>> I have a multicore setup as well, and when I query something, I do it
>> through core0, then specify both core0 and core1 in the "shards"
>> parameter.
>> 
>> However, I don't have identical indices. The results I get back are
>> basically an addition of both cores' results.
>> 
>> Good luck, please reply to this message if you have it figured out, I am
>> curious to know what's going on.
>> 
>> Regards
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Multicore-Solr-not-returning-expects-results-from-search-tp23623975p23730420.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Multicore Solr not returning expects results from search

2009-05-26 Thread ahammad

I have a multicore setup as well, and when I query something, I do it through
core0, then specify both core0 and core1 in the "shards" parameter.

However, I don't have identical indices. The results I get back are
basically an addition of both cores' results.
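
Concretely, the request looks something like this (host, port, and core
names are from my own setup):

http://localhost:8080/solr/core0/select?q=test&shards=localhost:8080/solr/core0,localhost:8080/solr/core1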

Good luck, please reply to this message if you have it figured out, I am
curious to know what's going on.

Regards


KennyN wrote:
> 
> I am still trying to figure this out... I am thinking maybe I have the
> shards setup wrong? If I have core0 and core1 with indices, and then I run
> the query on core0, specifying shards of core0 and core1. Is this how I
> should be doing it? Or should I have another core just to specify the
> other shards?
> 

-- 
View this message in context: 
http://www.nabble.com/Multicore-Solr-not-returning-expects-results-from-search-tp23623975p23729379.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing from DB connection issue

2009-05-26 Thread ahammad

I have an update:

I played around with it some more and it seems like it's being caused by the
ClobTransformer. If I remove the 'clob="true"' from the field part in the
data-config, it works fine.

The Solr install is a multicore one. I placed the
apache-solr-dataimporthandler-1.4-dev.jar from the nightly builds in the
{solrHome}/core1/lib directory (I only need it for the first core). Is there
something else I need to do for it to work?

I don't recall doing an additional step when I did this a few weeks ago on
my local machine.

Any help is appreciated.

Regards


ahammad wrote:
> 
> Hello all,
> 
> I am trying to index directly from an Oracle DB. This is what appears in
> the stack trace:
> 
> SEVERE: Full Import failed
> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
> execute query: select * from ARTICLE Processing Document # 1
>   at
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:186)
>   at
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:143)
>   at
> org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:43)
>   at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>   at
> org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
>   at
> org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>   at
> org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>   at
> org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>   at
> org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>   at
> org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>   at
> org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
> Caused by: java.sql.SQLException: Closed Connection
>   at
> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
>   at
> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
>   at
> oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:208)
>   at
> oracle.jdbc.driver.PhysicalConnection.createStatement(PhysicalConnection.java:755)
>   at
> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:174)
>   ... 10 more
> 
> Funny thing is, the data import works on my local machine. I moved all the
> config files to another server, and I get this. I reindexed on my local
> machine immediately after in order to verify that the DB works, and it
> indexes fine.
> 
> Here is my data-config file, just in case:
> 
> <dataConfig>
>     <dataSource driver="..." url="..." user="xxx" password="xxx"/>
>     <document>
>         <entity name="ARTICLE" query="select * from ARTICLE"
>                 transformer="ClobTransformer">
>             <field column="..." name="..." />
>             <field column="..." name="..." clob="true" />
>             <field column="..." name="..." />
>             <entity name="..."
>                     query="select ID_A from ARTICLE_AUTHOR where ID_A='${ARTICLE.ID}'">
>                 <field column="ID_A" name="author" />
>             </entity>
>         </entity>
>     </document>
> </dataConfig>
> I am using the 1.3 release version, with the 1.4 DIH jar file for the Clob
> Transformer. What could be causing this?
> 
> Cheers
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-from-DB-connection-issue-tp23725712p23728596.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing from DB connection issue

2009-05-26 Thread ahammad

Hello Erik,

Yes, the drivers are there.

I forgot to mention, this DB indexing was working before on the server when
the DB was using a different schema.

The schema has changed, so I did all my testing on my local machine. When I
saw that it worked fine, I put in the new connection string/user/pass and
tried it on the server...


Erik Hatcher wrote:
> 
> Did you move the Oracle JDBC driver to the other machine also?
> 
>   Erik
> 
> On May 26, 2009, at 11:37 AM, ahammad wrote:
> 
>>
>> Hello all,
>>
>> I am trying to index directly from an Oracle DB. This is what
>> appears in the
>> stack trace:
>>
>> SEVERE: Full Import failed
>> org.apache.solr.handler.dataimport.DataImportHandlerException:  
>> Unable to
>> execute query: select * from ARTICLE Processing Document # 1
>>  at
>> org.apache.solr.handler.dataimport.JdbcDataSource 
>> $ResultSetIterator.<init>(JdbcDataSource.java:186)
>>  at
>> org 
>> .apache 
>> .solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java: 
>> 143)
>>  at
>> org 
>> .apache 
>> .solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java: 
>> 43)
>>  at
>> org 
>> .apache 
>> .solr 
>> .handler 
>> .dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
>>  at
>> org 
>> .apache 
>> .solr 
>> .handler 
>> .dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
>>  at
>> org 
>> .apache 
>> .solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>  at
>> org 
>> .apache 
>> .solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>  at
>> org 
>> .apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java: 
>> 136)
>>  at
>> org 
>> .apache 
>> .solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java: 
>> 334)
>>  at
>> org 
>> .apache 
>> .solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>  at
>> org.apache.solr.handler.dataimport.DataImporter 
>> $1.run(DataImporter.java:377)
>> Caused by: java.sql.SQLException: Closed Connection
>>  at
>> oracle 
>> .jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
>>  at
>> oracle 
>> .jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
>>  at
>> oracle 
>> .jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:208)
>>  at
>> oracle 
>> .jdbc 
>> .driver.PhysicalConnection.createStatement(PhysicalConnection.java: 
>> 755)
>>  at
>> org.apache.solr.handler.dataimport.JdbcDataSource 
>> $ResultSetIterator.<init>(JdbcDataSource.java:174)
>>  ... 10 more
>>
>> Funny thing is, the data import works on my local machine. I moved  
>> all the
>> config files to another server, and I get this. I reindexed on my  
>> local
>> machine immediately after in order to verify that the DB works, and it
>> indexes fine.
>>
>> Here is my data-config file, just in case:
>>
>> <dataConfig>
>>     <dataSource driver="..." url="..." user="xxx" password="xxx"/>
>>     <document>
>>         <entity name="ARTICLE" query="select * from ARTICLE"
>>                 transformer="ClobTransformer">
>>             <field column="..." name="..." />
>>             <field column="..." name="..." clob="true" />
>>             <field column="..." name="..." />
>>             <entity name="..."
>>                     query="select ID_A from ARTICLE_AUTHOR where ID_A='${ARTICLE.ID}'">
>>                 <field column="ID_A" name="author" />
>>             </entity>
>>         </entity>
>>     </document>
>> </dataConfig>
>>
>> I am using the 1.3 release version, with the 1.4 DIH jar file for  
>> the Clob
>> Transformer. What could be causing this?
>>
>> Cheers
>> -- 
>> View this message in context:
>> http://www.nabble.com/Indexing-from-DB-connection-issue-tp23725712p23725712.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-from-DB-connection-issue-tp23725712p23727121.html
Sent from the Solr - User mailing list archive at Nabble.com.



Indexing from DB connection issue

2009-05-26 Thread ahammad

Hello all,

I am trying to index directly from an Oracle DB. This is what appears in the
stack trace:

SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
execute query: select * from ARTICLE Processing Document # 1
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:186)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:143)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:43)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:59)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
Caused by: java.sql.SQLException: Closed Connection
at
oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:112)
at
oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:146)
at
oracle.jdbc.driver.DatabaseError.throwSqlException(DatabaseError.java:208)
at
oracle.jdbc.driver.PhysicalConnection.createStatement(PhysicalConnection.java:755)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:174)
... 10 more

Funny thing is, the data import works on my local machine. I moved all the
config files to another server, and I get this. I reindexed on my local
machine immediately after in order to verify that the DB works, and it
indexes fine.

Here is my data-config file, just in case:

<dataConfig>
    <dataSource driver="..." url="..." user="xxx" password="xxx"/>
    <document>
        <entity name="ARTICLE" query="select * from ARTICLE"
                transformer="ClobTransformer">
            <field column="..." name="..." />
            <field column="..." name="..." clob="true" />
            <field column="..." name="..." />
            <entity name="..."
                    query="select ID_A from ARTICLE_AUTHOR where ID_A='${ARTICLE.ID}'">
                <field column="ID_A" name="author" />
            </entity>
        </entity>
    </document>
</dataConfig>

I am using the 1.3 release version, with the 1.4 DIH jar file for the Clob
Transformer. What could be causing this?

Cheers
-- 
View this message in context: 
http://www.nabble.com/Indexing-from-DB-connection-issue-tp23725712p23725712.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unique Identifiers

2009-04-30 Thread ahammad

Hello,

How would I go about creating an aggregate entry? Does it go in the
data-config.xml file?

Also, out of curiosity, how can I access the UUIDField variable? It may be
required for something else.

Cheers
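
For anyone reading this later: one way to build the aggregate entry is
DIH's TemplateTransformer, declared right in data-config.xml. A sketch
(the entity, column, and "articles-" prefix are assumptions for
illustration, not from this thread):

<entity name="ARTICLE" query="select * from ARTICLE"
        transformer="TemplateTransformer">
  <!-- Prefix a source name onto the row id to get a key that is
       unique across data sources. -->
  <field column="uniqueDocID" template="articles-${ARTICLE.ID}"/>
</entity>

The second data source would use its own prefix (for example
"sr-${SR.SRNum}"), so the combined index never collides on uniqueDocID.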


Erik Hatcher wrote:
> 
> 
> On Apr 28, 2009, at 9:49 AM, ahammad wrote:
>> Is it possible for Solr to assign a unique number to every document?
> 
> Solr has a UUIDField that can be used for this.  But...
> 
>> For example, let's say that I am indexing from several databases with
>> different data structures. The first one has a unique field called  
>> artID,
>> and the second database has a unique field called SRNum. If I want  
>> to have
>> an interface that allows me to search both of those data sources, it  
>> makes
>> it easier to have a single field per document that is common to both
>> datasources...maybe something like uniqueDocID or something like that.
>>
>> That field does not exist in the DB. Is it possible for Solr to  
>> create that
>> field and assign a number while it's indexing?
> 
> I recommend an aggregate unique key field, using maybe this scheme:
> 
> <datasource>-<id>
> 
>   Erik
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Unique-Identifiers-tp23277538p23318361.html
Sent from the Solr - User mailing list archive at Nabble.com.



Importing data from Sybase

2009-04-28 Thread ahammad

Hello,

I'm trying to index data from a Sybase DB, but when I attempt to do a full
import, it fails. This is in the log:

SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException:
java.lang.AbstractMethodError:
com.sybase.jdbc2.jdbc.SybConnection.setHoldability(I)V
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:404)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:221)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:164)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:312)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:370)
at
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:351)
Caused by: java.lang.AbstractMethodError:
com.sybase.jdbc2.jdbc.SybConnection.setHoldability(I)V
at
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:181)
at
org.apache.solr.handler.dataimport.JdbcDataSource$1.call(JdbcDataSource.java:127)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getConnection(JdbcDataSource.java:361)
at
org.apache.solr.handler.dataimport.JdbcDataSource.access$300(JdbcDataSource.java:38)
at
org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:237)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:207)
at
org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:38)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58)
at
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:73)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:335)


This seems to me like a Sybase issue but I'm unsure. Is Solr designed to be
compatible with Sybase or has it had issues in the past?

My data-config.xml file is pretty much the same as the one I have for an
Oracle DB, except with the appropriate changes made to
driver/url/user/password fields.

Cheers
-- 
View this message in context: 
http://www.nabble.com/Importing-data-from-Sybase-tp23284464p23284464.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Unable to import data from database

2009-04-28 Thread ahammad

Did you define all the fields that you used in schema.xml?
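
For example, every column mapped in data-config.xml needs a matching
declaration there (a sketch; the names are made up):

<field name="id" type="string" indexed="true" stored="true"/>
<field name="name" type="text" indexed="true" stored="true"/>

If a mapped name is neither declared nor covered by a dynamicField,
Solr rejects the document with an "unknown field" error.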



Ci-man wrote:
> 
> I am using MS SQL server and want to index a table.
> I setup my data-config like this:
> 
> <dataConfig>
>   <dataSource autoCommit="true"
>     driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
>     url="jdbc:sqlserver://localhost:1433;databaseName=MYDB"
>     user="" password=""/>
>   <document>
>     <entity name="..." query="...">
>       <field column="..." name="..." />
>     </entity>
>   </document>
> </dataConfig>
> 
> I am unable to load data from database. I always receive 0 document
> fetched:
> 
> <lst name="statusMessages">
>   <str name="Time Elapsed">0:0:12.989</str>
>   <str name="Total Requests made to DataSource">1</str>
>   <str name="Total Rows Fetched">0</str>
>   <str name="Total Documents Processed">0</str>
>   <str name="Total Documents Skipped">0</str>
>   <str name="Full Dump Started">2009-04-28 14:37:49</str>
> </lst>
> 
> 
> The query runs in SQL Server query manager and retrieves records. The
> funny thing is, even if I purposefully write a wrong query with
> non-existing tables I get the same response. What am I doing wrong? How
> can I tell whether a query fails or succeeds or if solr is running the
> query in the first place?
> 
> Any help is appreciated.
> Best,
> -Ci 
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Unable-to-import-data-from-database-tp23283852p23284381.html
Sent from the Solr - User mailing list archive at Nabble.com.



Unique Identifiers

2009-04-28 Thread ahammad

Hello all,

Is it possible for Solr to assign a unique number to every document?

For example, let's say that I am indexing from several databases with
different data structures. The first one has a unique field called artID,
and the second database has a unique field called SRNum. If I want to have
an interface that allows me to search both of those data sources, it makes
it easier to have a single field per document that is common to both
datasources...maybe something like uniqueDocID or something like that.

That field does not exist in the DB. Is it possible for Solr to create that
field and assign a number while it's indexing?
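
As it turns out (see Erik's reply above), Solr's UUIDField can do exactly
this. A sketch of the schema.xml entries, with an assumed field name:

<fieldType name="uuid" class="solr.UUIDField" indexed="true"/>
<field name="uniqueDocID" type="uuid" indexed="true" stored="true"
       default="NEW"/>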

Cheers
-- 
View this message in context: 
http://www.nabble.com/Unique-Identifiers-tp23277538p23277538.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing from a DB, corrupt Lucene index

2009-04-22 Thread ahammad

Excuse the error in the title. It should say "missing Lucene index"

Cheers


ahammad wrote:
> 
> Hello,
> 
> I finally was able to run a full import on an Oracle database. According
> to the statistics, it looks like it fetched all the rows from the table.
> However, when I go into /data, there is nothing in there.
> 
> This is my data-config.xml file:
> 
> <dataConfig>
>     <dataSource driver="..." url="..." user="" password=""/>
>     <document>
>         <entity name="..." query="...">
>             <field column="..." name="..." />
>             <field column="..." name="..." />
>             <field column="..." name="..." />
>             <field column="..." name="..." />
>         </entity>
>     </document>
> </dataConfig>
> 
> 
> 
> I added all the relevant fields in the schema.xml file. From the interface
> when I do dataimport?command=full-import, it says that "n rows were
> fetched", where n is the actual number of rows in the DB table. Everything
> looks great from there, but there is nothing in my data folder. In
> solrconfig.xml, the line that defines the location where data is stored
> is:
> 
> <dataDir>${solr.data.dir:./solr/data}</dataDir>
> 
> What am I missing exactly? BTW, the Tomcat logs don't show errors or
> anything like that.
> 
> Cheers and Thank you.
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-from-a-DB%2C-corrupt-Lucene-index-tp23175796p23175805.html
Sent from the Solr - User mailing list archive at Nabble.com.



Indexing from a DB, corrupt Lucene index

2009-04-22 Thread ahammad

Hello,

I finally was able to run a full import on an Oracle database. According to
the statistics, it looks like it fetched all the rows from the table.
However, when I go into /data, there is nothing in there.

This is my data-config.xml file:

<dataConfig>
    <dataSource driver="..." url="..." user="" password=""/>
    <document>
        <entity name="..." query="...">
            <field column="..." name="..." />
            <field column="..." name="..." />
            <field column="..." name="..." />
            <field column="..." name="..." />
        </entity>
    </document>
</dataConfig>

I added all the relevant fields in the schema.xml file. From the interface
when I do dataimport?command=full-import, it says that "n rows were
fetched", where n is the actual number of rows in the DB table. Everything
looks great from there, but there is nothing in my data folder. In
solrconfig.xml, the line that defines the location where data is stored is:

<dataDir>${solr.data.dir:./solr/data}</dataDir>

What am I missing exactly? BTW, the Tomcat logs don't show errors or
anything like that.

Cheers and Thank you.
-- 
View this message in context: 
http://www.nabble.com/Indexing-from-a-DB%2C-corrupt-Lucene-index-tp23175796p23175796.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using Solr to index a database

2009-04-21 Thread ahammad

Thanks for the link...

I'm still a bit unclear as to how it goes. For example, let's say I have a
table called PRODUCTS, and within that table, I have the following columns:
NUMBER (product number)
NAME (product name)
PRICE

How would I index all this information? Here is an example (from the links
you provided) of xml that confuses me:

<entity name="item" pk="ID" query="select * from item"
        deltaQuery="select id from item where last_modified >
                    '${dataimporter.last_index_time}'">
    ...
</entity>

What is that deltaQuery line (or even a regular "query" expression) for?
It seems to me like a sort of filter. What if I don't want to
filter anything and just want to index all the rows?

Cheers
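
For the record: deltaQuery only drives incremental (delta) imports, so a
plain full import needs nothing but a query attribute. A minimal
data-config sketch for the PRODUCTS example above (the Solr field names
are assumed, and each must also be declared in schema.xml):

<dataConfig>
  <dataSource driver="..." url="..." user="..." password="..."/>
  <document>
    <entity name="products" query="select NUMBER, NAME, PRICE from PRODUCTS">
      <field column="NUMBER" name="productNumber"/>
      <field column="NAME" name="productName"/>
      <field column="PRICE" name="price"/>
    </entity>
  </document>
</dataConfig>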




Noble Paul നോബിള്‍  नोब्ळ् wrote:
> 
> On Mon, Apr 20, 2009 at 7:15 PM, ahammad  wrote:
>>
>> Hello,
>>
>> I've never used Solr before, but I believe that it will suit my current
>> needs with indexing information from a database.
>>
>> I downloaded and extracted Solr 1.3 to play around with it. I've been
>> looking at the following tutorials:
>> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
>> http://wiki.apache.org/solr/DataImportHandler
>>
>> There are a few things I don't understand. For example, the IBM article
>> sometimes refers to directories that aren't there, or a little different
>> from what I have in my extracted copy of Solr (ie
>> solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I
>> can,
>> but as soon as I put the following in solrconfig.xml, the whole thing
>> breaks:
>>
>> <requestHandler name="/dataimport"
>>     class="org.apache.solr.handler.dataimport.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">rss-data-config.xml</str>
>>   </lst>
>> </requestHandler>
>>
>> Obviously I replace with my own info...One thing I don't quite get is the
>> data-config.xml file. What exactly is it? I've seen examples of what it
>> contains but since I don't know enough, I couldn't really adjust it. In
>> any
>> case, this is the error I get, which may be because of a misconfigured
>> data-config.xml...
> the data-config.xml describes how to fetch data from various data
> sources and index them into Solr.
> 
> The stacktrace says that your xml is invalid.
> 
> The best bet is to take one of the sample dataconfig xml files and make
> changes.
> 
> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/db/conf/db-data-config.xml?revision=691151&view=markup
> 
> http://svn.apache.org/viewvc/lucene/solr/trunk/example/example-DIH/solr/rss/conf/rss-data-config.xml?revision=691151&view=markup
> 
> 
>>
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Exception
>> occurred while initializing context at
>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
>> at
>> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
>> at
>> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
>> at
>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571) at
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
>> at
>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
>> at
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
>> at
>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
>> at
>> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
>> at
>> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
>> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
>> at
>> org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at
>> org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at
>> org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at
>> org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at
>> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)

Re: Solr webinar

2009-04-20 Thread ahammad

Hello Erik,

I'm interested in attending the Webinar. I just have some questions to
verify whether or not I am fit to attend...

1) How will it be carried out? What software or application would I need?
2) Do I have to have any experience or can I attend for the purpose of
learning about Solr?

Thanks for taking time to do this.

Regards


Erik Hatcher wrote:
> 
> (excuse the cross-post)
> 
> I'm presenting a webinar on Solr.  Registration is limited, so sign up  
> soon.  Looking forward to "seeing" some of you there!
> 
> Thanks,
>   Erik
> 
> 
> "Got data? You can build your own Solr-powered Search Engine!"
> 
> Erik Hatcher, Lucene/Solr Committer and author, will show you how you  
> how to use Solr to build an Enterprise Search engine that indexes a  
> variety data sources all in a matter of minutes!
> 
> Thursday, April 30, 2009
> 11:00AM - 12:00PM PDT / 2:00PM - 3:00PM EDT
> 
> Sign up for this free webinar today at
> http://www2.eventsvc.com/lucidimagination/?trk=E1
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Solr-webinar-tp23138157p23138451.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using Solr to index a database

2009-04-20 Thread ahammad

For now it's unclear, as this is sort of an "experiment" to see how much we
can do with it. I am inclined to use the index within Solr though, simply
for the very powerful querying (the stuff I've seen at least). I am not
exactly sure how much of the querying capabilities I'll require though.

I'll take a look at LuSql and see if it can be used for my purposes. I want
to get Solr working though, because I know that later down the road I'm
going to need it for another project...



Glen Newton wrote:
> 
> You have not indicated how you wish to use the index (inside Solr or not).
> 
> It is possible that LuSql might be an preferable alternative to
> Solr/DataImportHandler, depending on your requirements.
> 
> LuSql: http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql
> 
> Disclaimer: I am the author of LuSql.
> 
> -glen
> 
> 2009/4/20 ahammad :
>>
>> Hello,
>>
>> I've never used Solr before, but I believe that it will suit my current
>> needs with indexing information from a database.
>>
>> I downloaded and extracted Solr 1.3 to play around with it. I've been
>> looking at the following tutorials:
>> http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
>> http://wiki.apache.org/solr/DataImportHandler
>>
>> There are a few things I don't understand. For example, the IBM article
>> sometimes refers to directories that aren't there, or a little different
>> from what I have in my extracted copy of Solr (ie
>> solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I
>> can,
>> but as soon as I put the following in solrconfig.xml, the whole thing
>> breaks:
>>
>> <requestHandler name="/dataimport"
>>     class="org.apache.solr.handler.dataimport.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">rss-data-config.xml</str>
>>   </lst>
>> </requestHandler>
>>
>> Obviously I replace with my own info...One thing I don't quite get is the
>> data-config.xml file. What exactly is it? I've seen examples of what it
>> contains but since I don't know enough, I couldn't really adjust it. In
>> any
>> case, this is the error I get, which may be because of a misconfigured
>> data-config.xml...
>>
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Exception
>> occurred while initializing context at
>> org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
>> at
>> org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
>> at
>> org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
>> at
>> org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
>> at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571) at
>> org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
>> at
>> org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
>> at
>> org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
>> at
>> org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
>> at
>> org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
>> at
>> org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
>> at
>> org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
>> at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
>> at
>> org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at
>> org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at
>> org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at
>> org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at
>> org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
>> at
>> org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
>> at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
>> at
>> org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at
>> org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at
>> org.apache.catalina.core.StandardEngine.start(

Using Solr to index a database

2009-04-20 Thread ahammad

Hello,

I've never used Solr before, but I believe that it will suit my current
needs with indexing information from a database.

I downloaded and extracted Solr 1.3 to play around with it. I've been
looking at the following tutorials:
http://www.ibm.com/developerworks/java/library/j-solr-update/index.html
http://wiki.apache.org/solr/DataImportHandler

There are a few things I don't understand. For example, the IBM article
sometimes refers to directories that aren't there, or are a little different
from what I have in my extracted copy of Solr (i.e.
solr-dw/rss/conf/solrconfig.xml). I tried to follow the steps as best I can,
but as soon as I put the following in solrconfig.xml, the whole thing
breaks:

<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">rss-data-config.xml</str>
  </lst>
</requestHandler>

Obviously I replace with my own info...One thing I don't quite get is the
data-config.xml file. What exactly is it? I've seen examples of what it
contains but since I don't know enough, I couldn't really adjust it. In any
case, this is the error I get, which may be because of a misconfigured
data-config.xml...

org.apache.solr.handler.dataimport.DataImportHandlerException: Exception
occurred while initializing context at
org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:165)
at
org.apache.solr.handler.dataimport.DataImporter.<init>(DataImporter.java:99)
at
org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:96)
at
org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:571) at
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:122)
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544) at
org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at
org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at
org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022) at
org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at
org.apache.catalina.core.StandardService.start(StandardService.java:448) at
org.apache.catalina.core.StandardServer.start(StandardServer.java:700) at
org.apache.catalina.startup.Catalina.start(Catalina.java:552) at
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at
sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at
sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at
java.lang.reflect.Method.invoke(Unknown Source) at
org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at
org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Caused by:
org.xml.sax.SAXParseException: The element type "document" must be
terminated by the matching end-tag "</document>". at
com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(Unknown Source)
at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(Unknown
Source) at
org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:153)


It's unclear to me what I need to be using, as in what directories/files I
need to implement this. Can someone please point me in the right direction?

BTW, I'm using Tomcat 5.5 because the prepackaged Jetty simply doesn't work
for me. It shows that it "started" in the command line, but it hangs, and
doesn't actually work when I try to hit the Solr admin page (page not found
type error). Jetty itself does start but the project doesn't seem to
deploy...

I apologize for the long post and if I didn't provide as much information as
I should. Let me know if you need clarification with anything I said.

Thank you very much.
-- 
View this message in context: 
http://www.nabble.com/Using-Solr-to-index-a-database-tp23136944p23136944.html
Sent from the Solr - User mailing list archive at

Re: Integrating Solr and Nutch

2009-03-02 Thread ahammad

Thanks for your reply Andrzej. I am very interested in learning more about
this and I cannot wait to check it out. Nutch is extremely good on its own,
but I want to know what else can be done with the Nutch/Solr combo.

Cheers
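
For anyone else trying this: the solrindex command Andrzej mentions below
is invoked along these lines (the Solr URL and crawl paths are
illustrative):

bin/nutch solrindex http://localhost:8983/solr/ crawl/crawldb crawl/linkdb crawl/segments/*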


Andrzej Bialecki wrote:
> 
> Tony Wang wrote:
>> I heard Nutch 1.0 will have an easy way to integrate with Solr, but I
>> haven't found any documentation on that yet. anyone?
> 
> Indeed, this integration is already supported in Nutch trunk (soon to be 
> released). Please download a nightly package and test it.
> 
> You will need to reindex your segments using the solrindex command, and 
> change the searcher configuration. See nutch-default.xml for details.
> 
> -- 
> Best regards,
> Andrzej Bialecki <><
>   ___. ___ ___ ___ _ _   __
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Integrating-Solr-and-Nutch-tp22252531p22289675.html
Sent from the Solr - User mailing list archive at Nabble.com.



Integrating Solr and Nutch

2009-02-27 Thread ahammad

Hello,

I'm wondering if it's possible to make Solr use a Nutch index. I used Nutch
to crawl some pages and I now have an index with about 2000 documents. I
want to explore the features of Solr, and since both Nutch and Solr are
based off Lucene, I assume that there is some way to integrate them with one
another.

Has this been implemented?

I am using the latest release versions of Nutch and Solr.

Cheers
-- 
View this message in context: 
http://www.nabble.com/Integrating-Solr-and-Nutch-tp22252531p22252531.html
Sent from the Solr - User mailing list archive at Nabble.com.