commit, concurrency, full text search

2007-09-17 Thread Dilip.TS
Hi,

1) How does commit work with multiple requests?
2) Does Solr handle concurrency during updates?
3) Does Solr support anything like enclosing keywords within quotes so that
we search for exactly those keywords together, the way Google does? For
example, if I enclose "java programming" in quotes, it should search for
the phrase as a whole instead of breaking it apart.
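
For reference, a minimal sketch of the phrase syntax in question (the field
name text is an assumption from the example schema); enclosing the terms in
quotes makes the standard query parser match them as a single phrase:

  q=text:"java programming"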

Any help would be highly appreciated.

Thanks in advance.

Regards,
Dilip



largish test data set?

2007-09-17 Thread David Welton
Hi,

I'm in the process of evaluating solr and sphinx, and have come to
realize that actually having a large data set to run them against
would be handy.  However, I'm pretty new to both systems, so thought
that perhaps asking around might produce something useful.

What *I* mean by largish is something that won't fit into memory - say
5 or 6 gigs, which is probably puny for some and huge for others.

BTW, I would also welcome any input from others who have done the
above comparison, although what we'll be using it for is specific
enough that of course I'll need to do my own testing.

Thanks!
-- 
David N. Welton
http://www.welton.it/davidw/


solr locked itself out

2007-09-17 Thread vanderkerkoff

Hello everyone.

I've been reading some posts on this forum and I thought it best to start my
own post, as our situation is different from everyone else's, isn't it always
:-)

We've got a Django-powered website that has Solr as its search engine.

We're using the example Solr application and starting Java at boot time
with 

java -jar start.jar in the example directory

We've had no problem at all until this morning when I started getting an
error saying that solr was locked.

I checked the /tmp directory and in there was a file called
lucene-75248553b96c7f175a8217320c9b8471-write.lock

It's not a very busy website at all and doesn't have a lot of data in it; can
someone get me started on how to make sure this doesn't happen again?

some more information

ulimit is unlimited, and cat /proc/sys/fs/file-max reports 11769

in the /tmp directory are 18 directories, all called Jetty_8983__solr, and 17
of them have numbers at the end of the directory name.

Sorry I'm such a newbie at this, but any help will be greatly appreciated.



Re: solr locked itself out

2007-09-17 Thread Ryan McKinley

vanderkerkoff wrote:

I found another post that suggested editing the unlockOnStartup value in
solrconfig.xml.

Is that a wise idea?



If you only have a single Solr instance at a time, it should be totally
fine.
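
For reference, a minimal sketch of what that change looks like; in the
example solrconfig.xml the flag lives in the mainIndex section:

  <mainIndex>
    <!-- remove a stale Lucene write lock left behind by a crashed or
         killed process; only safe when a single Solr instance writes
         to this index -->
    <unlockOnStartup>true</unlockOnStartup>
  </mainIndex>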


Re: Can we build complex filter queries in SOLR

2007-09-17 Thread Alessandro Ferrucci
yeah that is possible, I just tried it on one of my Solr instances... let's say
you have an index of player names:

(first-name:Tim AND last-name:Anderson) OR (first-name:Anwar AND
last-name:Johnson) OR (conference:Mountain West)

will give you the results that logically match this query.
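
As a rough sketch of how that could be sent over HTTP (URL-encoding elided
for readability; multi-word values such as Mountain West need quotes so the
parser keeps them together):

  http://localhost:8983/solr/select?q=(first-name:Tim AND last-name:Anderson)
    OR (first-name:Anwar AND last-name:Johnson) OR (conference:"Mountain West")

The same expression can also go in an fq filter parameter if you want it
cached and applied as a filter rather than scored.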

HTH.

Alessandro Ferrucci :)

On 9/17/07, Dilip.TS [EMAIL PROTECTED] wrote:

 Hi,

 I would like to know if we can build a complex filter queryString in SOLR
 using the following condition.
 (Field1 = abc AND Field2 = def) OR (Field3 = abcd AND Field4 = defgh
 AND (...)) and so on...

 Thanks in advance

 Regards,
 Dilip TS




Re: largish test data set?

2007-09-17 Thread Grant Ingersoll
You might be interested in the Lucene Java contrib/benchmark task,
which provides an indexing implementation for a Wikipedia download
(available at http://people.apache.org/~gsingers/wikipedia/).


It is pretty trivial to convert the indexing code to send add  
commands to Solr.
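
As a hedged illustration of what sending add commands amounts to, here is a
minimal Java sketch that POSTs one document to a local Solr /update handler
(the field names are hypothetical; error handling and batching omitted):

  import java.io.OutputStream;
  import java.net.HttpURLConnection;
  import java.net.URL;

  public class SolrAddSketch {
    public static void main(String[] args) throws Exception {
      // one <add> command containing a single document
      String doc = "<add><doc>"
          + "<field name=\"id\">wiki-1</field>"
          + "<field name=\"title\">Example article</field>"
          + "</doc></add>";
      HttpURLConnection conn = (HttpURLConnection)
          new URL("http://localhost:8983/solr/update").openConnection();
      conn.setDoOutput(true);
      conn.setRequestMethod("POST");
      conn.setRequestProperty("Content-Type", "text/xml; charset=UTF-8");
      try (OutputStream out = conn.getOutputStream()) {
        out.write(doc.getBytes("UTF-8"));
      }
      System.out.println("HTTP " + conn.getResponseCode());
    }
  }

A <commit/> posted the same way afterwards makes the documents searchable.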


HTH,
Grant

On Sep 17, 2007, at 6:06 AM, David Welton wrote:


Hi,

I'm in the process of evaluating solr and sphinx, and have come to
realize that actually having a large data set to run them against
would be handy.  However, I'm pretty new to both systems, so thought
that perhaps asking around might produce something useful.

What *I* mean by largish is something that won't fit into memory - say
5 or 6 gigs, which is probably puny for some and huge for others.

BTW, I would also welcome any input from others who have done the
above comparison, although what we'll be using it for is specific
enough that of course I'll need to do my own testing.

Thanks!
--
David N. Welton
http://www.welton.it/davidw/





Re: largish test data set?

2007-09-17 Thread Daniel Alheiros
Hi Yonik.

Do you have any performance statistics about those changes?
Is it possible to upgrade to this new Lucene version while staying on the
stable Solr 1.2 release?

Regards,
Daniel


On 17/9/07 17:37, Yonik Seeley [EMAIL PROTECTED] wrote:

 If you want to see what performance will be like on the next release,
 you could try upgrading Solr's internal version of lucene to trunk
 (current dev version)... there have been some fantastic improvements
 in indexing speed.
 
 For query speed/throughput, Solr 1.2 or trunk should do fine.
 
 -Yonik
 
 On 9/17/07, David Welton [EMAIL PROTECTED] wrote:
 Hi,
 
 I'm in the process of evaluating solr and sphinx, and have come to
 realize that actually having a large data set to run them against
 would be handy.  However, I'm pretty new to both systems, so thought
that perhaps asking around might produce something useful.
 
 What *I* mean by largish is something that won't fit into memory - say
 5 or 6 gigs, which is probably puny for some and huge for others.
 
 BTW, I would also welcome any input from others who have done the
 above comparison, although what we'll be using it for is specific
 enough that of course I'll need to do my own testing.
 
 Thanks!
 --
 David N. Welton
 http://www.welton.it/davidw/
 





Re: 'suggest' query sorting

2007-09-17 Thread Matthew Runo
Hello! Were you able to find out anything? I'd be interested to know  
what you found out.


++
 | Matthew Runo
 | Zappos Development
 | [EMAIL PROTECTED]
 | 702-943-7833
++


On Sep 15, 2007, at 11:48 PM, Ryan McKinley wrote:


Hello-

I'm building an interface where I need to display matching options  
as a user types into a search box.  Something like google suggest,  
but it needs to be a little more flexible in its matches.


At first glance, I thought I just needed to write a filter that
chunks each token into a set of prefixes.  Check SOLR-357 -- as
Hoss points out, I may just be able to use the EdgeNGramFilterFactory.


I have the basics working, but need some help getting the details  
to behave properly.


Consider the strings:
 Canon PowerShot
 iPod Cable
 Canon EX PIXMA
 Video Card

If I query for 'ca' I expect to get all these back.  This works
fine, but what I need help with is the ordering.


How can I boost words where the whole value (not just the token) is  
closer to the front of the value?  That is, I want 'ca' to return:

 1. Canon PowerShot
 2. Canon EX PIXMA
 3. iPod Cable
 4. Video Card
(actually 1 and 2 could be swapped)

After that works, how do I boost tokens that are closer together?   
If I search for 'canon p', how can I make sure the results are  
returned as:

 1. Canon PowerShot
 2. Canon EX PIXMA


thanks
ryan









Re: largish test data set?

2007-09-17 Thread Yonik Seeley
If you want to see what performance will be like on the next release,
you could try upgrading Solr's internal version of lucene to trunk
(current dev version)... there have been some fantastic improvements
in indexing speed.

For query speed/throughput, Solr 1.2 or trunk should do fine.

-Yonik

On 9/17/07, David Welton [EMAIL PROTECTED] wrote:
 Hi,

 I'm in the process of evaluating solr and sphinx, and have come to
 realize that actually having a large data set to run them against
 would be handy.  However, I'm pretty new to both systems, so thought
 that perhaps asking around might produce something useful.

 What *I* mean by largish is something that won't fit into memory - say
 5 or 6 gigs, which is probably puny for some and huge for others.

 BTW, I would also welcome any input from others who have done the
 above comparison, although what we'll be using it for is specific
 enough that of course I'll need to do my own testing.

 Thanks!
 --
 David N. Welton
 http://www.welton.it/davidw/



Re: largish test data set?

2007-09-17 Thread Karl Wettin


On 17 Sep 2007, at 12.06, David Welton wrote:



I'm in the process of evaluating solr and sphinx, and have come to
realize that actually having a large data set to run them against
would be handy.  However, I'm pretty new to both systems, so thought
that perhaps asking around might produce something useful.

What *I* mean by largish is something that won't fit into memory - say
5 or 6 gigs, which is probably puny for some and huge for others.


IMDB is about 1.2GB of data:

http://www.imdb.com/interfaces#plain

You can extract real queries from the TPB data collection; it should
contain about 1M queries in the movie category:

http://torrents.thepiratebay.org/3783572/db_dump_and_query_log_from_piratebay.org__summer_of_2006.3783572.TPB.torrent



--
karl


Re: Re[2]: multiple indices

2007-09-17 Thread Matt Kangas
Jack, the JNDI-enabling jarfiles now ship as part of the main .zip  
distribution. There is no need for a separate JettyPlus download as  
of Jetty 6.


I used Jetty 6.1.3 (http://dist.codehaus.org/jetty/jetty-6.1.x/jetty-6.1.3.zip)
at the time, and I am using only these jarfiles from
the main distribution. I stripped everything else out that seemed  
unnecessary for running Solr.


lib/jetty-6.1.3.jar
lib/jetty-util-6.1.3.jar
lib/jsp-2.1/ant-1.6.5.jar
lib/jsp-2.1/core-3.1.1.jar
lib/jsp-2.1/jsp-2.1.jar
lib/jsp-2.1/jsp-api-2.1.jar
lib/naming/jetty-naming-6.1.3.jar
lib/plus/jetty-plus-6.1.3.jar
lib/servlet-api-2.5-6.1.3.jar

--Matt

On Sep 13, 2007, at 11:44 AM, Jack L wrote:


Thanks Matt, I'll give it a try! So this requires JettyPlus?

--
Best regards,
Jack

Wednesday, September 12, 2007, 5:14:32 AM, you wrote:


Jack, I've posted a complete recipe for running two Solr indices
within one Jetty 6 container:



http://wiki.apache.org/solr/SolrJetty



Scroll down to the part that says:

(7/2007 MattKangas) The recipe above didn't work for me with Jetty
6.1.3.

...

I'm glossing over a lot of details, so attached is a tarball with a
known-good configuration that runs two Solr instances inside one
Jetty container. I'm using Solr 1.2.0 and Jetty 6.1.3.





Hope this helps,
--matt



On Sep 11, 2007, at 11:52 AM, Jack L wrote:



I was going through some old emails on this topic. Rafael Rossini figured
out how to run multiple indices on a single instance of Jetty, but it has
to be JettyPlus. I guess plain Jetty doesn't allow this? I suppose I can
add the additional jars and make it work, but I haven't tried that. It'll
always be much safer/simpler/less playing around if a feature is
available out of the box.

I'm mentioning this again because I really think it's a desirable feature,
especially because each JVM uses a lot of memory and sometimes it's
not possible to start a new Jetty for each index due to memory
limitations.

I understand I can use a type field and mix doc types, but this is not
ideal for two reasons:

1. it's easier to maintain separate indices. I can just wipe out all
the files and re-post an individual index. Much less posting work to
do as opposed to re-posting all docs. Or I can move one index to
another partition, or even to another server to run separately in
order to scale up. It'll be a problem (although solvable by deleting
and re-posting) with a mixed index.

2. my understanding is that a mixed index means larger index files and
slower performance.

JettyPlus's download links seem to be broken, so I wasn't able to check
its download size. If not too big, maybe JettyPlus is an option?
If not, there should be a way to have this feature implemented on the
Solr side? Maybe by prefixing the REST URLs with index names...

--
Thanks,
Jack




--
Matt Kangas / [EMAIL PROTECTED]





--
Matt Kangas / [EMAIL PROTECTED]




Re: Indexing Speed

2007-09-17 Thread Mike Klaas

On 16-Sep-07, at 8:01 PM, erolagnab wrote:



Hi,

Just an FYI.

I've seen some posts mentioning that Solr can index 100-150 docs/s and the
comparison between embedded Solr and HTTP. I've tried to do the indexing
with 1.7+ million docs; each doc has 30 fields, among which 10 fields are
indexed/stored and the rest are only stored. The result was pretty
impressive: it took approx 1.4 hours to finish. Note that the docs were
sent synchronously, one after the other. The Solr server and client were
both running on a Pentium Dual Core 3.2, 2G RAM, Ubuntu Feisty.

The only issue I noticed is that Solr does occupy some amount of memory. In
the first run, after indexing around 500 thousand docs, it threw an
OutOfMemory exception. In the second trial, I set -Xms and -Xmx for the
JVM to run on 1G of memory, and Solr performed through to the finish.


You can tune memory usage by setting maxBufferedDocs to a lower  
value.  Also, watch out for large individual docs.
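
For example, a hedged sketch of where that knob lives in the example
solrconfig.xml (the value here is illustrative, not a recommendation):

  <mainIndex>
    <!-- write buffered documents to disk after this many adds,
         capping the memory used while indexing -->
    <maxBufferedDocs>1000</maxBufferedDocs>
  </mainIndex>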



Some questions:
1) Is it good practice to have Solr index docs in real time (millions of
docs per day)? What I'm worried about is that Solr may eat up memory as
it goes.


You can tune max memory usage (see above).


2) If docs are sent asynchronously, how well can Solr index?


As long as you don't send 1.7 million docs at once, you should see a
performance improvement.


-Mike


RE: Triggering snapshooter through web admin interface

2007-09-17 Thread Wu, Daniel
 There is no way to trigger snapshot taking through Solr's admin
 interface now.  Taking a snapshot is a very light-weight operation.  It
 uses hard links, so each snapshot doesn't take up much additional disk
 space.  If you
[Wu, Daniel]
Snapshot performance is not the concern; rather, it is the data
consistency requirement.

I was also suggesting a new feature: allow sending messages to Solr
through the HTTP interface, plus a mechanism for handling the message on
the Solr server; in this case, a message to trigger the snapshooter
script.  It seems to me a very useful feature to help simplify
operational issues.

 don't want to replicate your index while the big batch job is still
 running, you can disable snappuller on the slave while the batch job is
 running and enable it after the batch job has completed.
 
[Wu, Daniel]
Yes, it can be done that way, but it will not be as elegant.  The Solr
master can be well-known among the applications; however, the slaves
could be anywhere.  Turning off snappulling requires knowledge of all
the Solr slaves as well as the timing of the indexing.  It gets ugly when
there are multiple environments (e.g. dev, qa, stage, production) and
multiple indexes.

 


Faceting Vs using lucene filters ?

2007-09-17 Thread cricdigs

Hi,

I have a collection of blogs. Each Solr document holds one blog with 3 fields:
blogger (id), title, and blog text.
The search is performed over all 3 fields. When doing the search I need to
show 2 things:

1. Bloggers block with all the matching bloggers (so if a title, blog or
blogger contains the search term, I show the blogger's id)
2. Blogs block that shows the blog titles for the matching blogs.

The first block is my problem since it shows multiple instances of the same
blogger if that blogger has multiple matching blogs. I can use faceting to
show the bloggers but is there a better or more efficient way to do so? I
was thinking of creating a Lucene filter to do this; is it feasible?
Basically, I need the unique bloggers from the index whose blogs match a
given search term.

Thanks!



RE: Triggering snapshooter through web admin interface

2007-09-17 Thread Chris Hostetter

: I was also suggesting a new feature: allow sending messages to Solr
: through the HTTP interface, plus a mechanism for handling the message on
: the Solr server; in this case, a message to trigger the snapshooter
: script.  It seems to me a very useful feature to help simplify
: operational issues.

it's been a while since i looked at the SolrEventListener stuff, but i 
think that would be pretty easy to develop as a plugin.

The existing postCommit/postOptimize/firstSearcher/newSearcher event
listener tracking is part of the SolrCore because it needs to know about
them when managing the index ... but if you just wanted a way to trigger
arbitrary events by name, the utility functions used in SolrCore could be
reused by a custom plugin ... then you could reuse things like the
RunExecutableListener from your own RequestHandler with the same
solrconfig.xml syntax.

that would be a pretty cool addition to Solr ... an EventRequestHandler
that takes in a single event param and triggers all of the Listeners
configured for that event in the solrconfig.xml
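
for context, a sketch of the existing listener syntax such a handler could
reuse, along the lines of the example solrconfig.xml (paths illustrative):

  <listener event="postCommit" class="solr.RunExecutableListener">
    <!-- run the snapshooter script after every commit -->
    <str name="exe">snapshooter</str>
    <str name="dir">solr/bin</str>
    <bool name="wait">true</bool>
  </listener>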


-Hoss



Re: Faceting Vs using lucene filters ?

2007-09-17 Thread Chris Hostetter

: 1. Bloggers block with all the matching bloggers (so if a title, blog or
: blogger contains the search term, I show the blogger's id)

: The first block is my problem since it shows multiple instances of the same
: blogger if that blogger has multiple matching blogs. I can use faceting to
: show the bloggers but is there a better or more efficient way to do so? I

you've basically described the exact use case for faceting ... I'd be 
hard pressed to think of a more efficient way of listing every blogger who 
has at least one blog matching the query.
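
as a hedged sketch, such a request could look like this (the blogger field
is from the question; rows=0 if only the facet counts are needed, and for
clean facet values blogger should be a non-tokenized string type):

  http://localhost:8983/solr/select?q=searchterm&facet=true&facet.field=blogger&rows=0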



-Hoss



Re: Combining Proximity Range search

2007-09-17 Thread Chris Hostetter

: My document will have a multivalued compound field like
: 
: revision_01012007
: review_02012007
: 
: i am thinking of a query like comp:type:review date:[02012007 TO
: 02282007]~0

your best bet is to change that so revision and review are the names
of fields, and do a range search on them as needed.
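
a minimal sketch of that layout, with hypothetical field names (note that
string range queries are lexicographic, so a yyyyMMdd form sorts correctly
where MMddyyyy would not):

  <field name="review_date" type="string" indexed="true" stored="true"/>
  <field name="revision_date" type="string" indexed="true" stored="true"/>

  review_date:[20070201 TO 20070228]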




-Hoss



Re: 'suggest' query sorting

2007-09-17 Thread Chris Hostetter

: How can I boost words where the whole value (not just the token) is closer to
: the front of the value?  That is, I want 'ca' to return:
:  1. Canon PowerShot
:  2. Canon EX PIXMA
:  3. iPod Cable
:  4. Video Card
: (actually 1 and 2 could be swapped)

i would argue that you don't want #3 and #4 at all if you are doing query
suggestion; instead make the field you query use a KeywordTokenizer with
the EdgeNGramFilter so ca only matches #1 and #2.

if you really want #3 and #4 to show up, then have two fields: one using
the whitespace tokenizer, one using the keyword tokenizer, both using the
EdgeNGramFilter ... boost the query to the first field higher than the
second field (or just rely on the coordFactor and the fact that ca will
match on both fields for Canon PowerShot but only on the second field for
iPod Cable)

: After that works, how do I boost tokens that are closer together?  If I search
: for 'canon p', how can I make sure the results are returned as:
:  1. Canon PowerShot
:  2. Canon EX PIXMA

i think the two fields i described above will solve that problem as well.
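
a hedged sketch of what those two fields might look like in schema.xml
(names and gram sizes are illustrative; the attributes follow the
EdgeNGramFilterFactory from SOLR-357):

  <fieldType name="prefix_full" class="solr.TextField">
    <analyzer>
      <!-- whole value is one token: 'ca' only matches values starting with ca -->
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
    </analyzer>
  </fieldType>

  <fieldType name="prefix_token" class="solr.TextField">
    <analyzer>
      <!-- each word yields prefixes: 'ca' also matches iPod Cable -->
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
    </analyzer>
  </fieldType>

querying something like name_full:ca^10 name_tok:ca would then favor the
whole-value prefix matches.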




-Hoss



Re: EdgeNGramTokenFilter, term position?

2007-09-17 Thread Chris Hostetter
: Should the EdgeNGramFilter use the same term position for the ngrams within a
: single token?

i can see the argument going both ways ... imagine a hypothetical
CharSplitterTokenFilter that replaces each token in the stream with
one token per character in the original token (ie: hello becomes
h,e,l,l,o) ... should those tokens all have the same position?  they have a
logical ordered flow to them, so in theory they are sequential ... but
they did occupy the same space in the original token stream.

when in doubt: make it an option



-Hoss



Re: EdgeNGramTokenFilter, term position?

2007-09-17 Thread Yonik Seeley
On 9/16/07, Ryan McKinley [EMAIL PROTECTED] wrote:
 Should the EdgeNGramFilter use the same term position for the ngrams
 within a single token?

It feels like that is the right approach.
I don't see value in having them sequential, and I can think of uses
for having them overlap.

-Yonik


Re: Control index/store at document level

2007-09-17 Thread Chris Hostetter

: nope, the field options are created on startup -- you can't change them
: dynamically (i don't know all the details, but I think it is a file format
: issue, not just a configuration issue)

In the underlying Lucene library most of these options can be controlled 
per document, but Solr simplifies this away into configuration options 
instead of run time input ... it's a feature not a bug :)

: I'm not sure how your app is structured, from what you describe, it sounds
: like you need two fields, one that is indexed and not stored and another that
: is stored and not indexed.  For each revision, put text into the indexed
: field.  for the primary document, put it in both.

I concur.  copyField can make this easy ... make the source field
stored and the destination field indexed.  For the primary doc, add to
the source and let the copyField do its magic ... for all other revisions,
add directly to the destination field yourself.
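
a hedged sketch of that schema arrangement (field names hypothetical):

  <field name="body_src" type="string" indexed="false" stored="true"/>
  <field name="body_indexed" type="text" indexed="true" stored="false"/>
  <copyField source="body_src" dest="body_indexed"/>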



-Hoss



Re: Solr - rudimentary problems

2007-09-17 Thread Chris Hostetter
:  The corresponding entry for this field in schema.xml is :
:  <field name="id" type="text" indexed="true"
: stored="true" multiValued="false" required="true"/>

i'm guessing text is from the example schema.xml ... this is not a good
type to use for a uniqueKey field ... that alone might be causing some of
your problems with replacing docs ...  try string
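
for reference, a hedged sketch of the suggested change, assuming the stock
string type from the example schema:

  <field name="id" type="string" indexed="true" stored="true"
         multiValued="false" required="true"/>
  <uniqueKey>id</uniqueKey>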

: 2) Also, at the time of deleting a document, by providing its ID (exactly
: similar to the deleteById proc in the Embedded Solr example), we find that
: the document is not getting deleted (and we also do not get any errors).

sounds like the same problem ... i'm guessing you are using a method that 
assumes the id has already been transformed into the internal 
representation ... with text that might be lowercased, or stemmed, 
etc





-Hoss



Re: solr locked itself out

2007-09-17 Thread Adrian Sutton

ulimit is unlimited, and cat /proc/sys/fs/file-max reports 11769


I just went through the same kind of mistake - ulimit doesn't report
what you think it does; what you should check is ulimit -n (without the
-n you're looking at a different limit). If you're using bash as your
shell, that will almost certainly be 1024, which I've found isn't
enough to search and write at the same time.  The commit winds up
throwing an exception and the lock file(s) get left around, causing
further problems.

The first thing I'd try is upping ulimit -n to a much larger value and
seeing if that resolves the issue; it did for me.
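
A quick sketch of the check and the fix (the 20000 is an illustrative
value, not a tested recommendation):

  ulimit -n          # per-process open-file limit; bash default is often 1024
  ulimit -n 20000    # raise it in this shell before starting Jetty/Solr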


Regards,

Adrian Sutton
http://www.symphonious.net


UserTagDesign

2007-09-17 Thread Karl Wettin
I've been looking at http://wiki.apache.org/solr/UserTagDesign on  
and off for a while and think all the use cases could be explained  
with simple UML class diagram semantics:



[Taggable] (tag:Tag) -- {0..*} ---|--- {0..*} -- (tag:Tag) [Tagger]
                                  |
                              [Tagging]


Rendered: http://ginandtonique.org/~kalle/tagging.pdf

This is of course a design that might not fit everybody; it could be
represented using an n-ary association or whatnot. But I find the text
on the wiki much easier to follow with this in my head.


How (or even if) one would represent this in an index is a completely
different story.


Translated to Java the diagram would look something like this:


/** the user */
class Tagger {
  Map<Tag, Set<Tagging>> taggingsByTag;
}

/** the content */
class Taggable {
  Map<Tag, Set<Tagging>> taggingsByTag;
}

/** content tagging */
class Tagging {
  Tagger tagger;
  Taggable tagged;
  Date created;
}

class Tag {
  String text;
}


Thought it was better to let you people decide whether or not this  
fits in the wiki.



--
karl




Re: 'suggest' query sorting

2007-09-17 Thread Ryan McKinley


if you really want #3 and #4 to show up, then have two fields: one using
the whitespace tokenizer, one using the keyword tokenizer, both using the
EdgeNGramFilter ... boost the query to the first field higher than the
second field (or just rely on the coordFactor and the fact that ca will
match on both fields for Canon PowerShot but only on the second field for
iPod Cable)




I'm working with person names that are sometimes reversed... it needs to 
treat the last name (that may be the first name) with the same weight.


Yes, this scheme works great.  Thanks.

I added the config I'm using to SOLR-357 and closed the issue.
Hopefully the next person searching for how to do this will know to look
at the EdgeNGramFilter.


ryan


RE: Triggering snapshooter through web admin interface

2007-09-17 Thread Wu, Daniel


 -Original Message-
 From: Chris Hostetter [mailto:[EMAIL PROTECTED]
 Sent: Monday, September 17, 2007 1:28 PM
 To: solr-user@lucene.apache.org
 Subject: RE: Triggering snapshooter through web admin interface
 
 
  : I was also suggesting a new feature: allow sending messages to Solr
  : through the HTTP interface, plus a mechanism for handling the message
  : on the Solr server; in this case, a message to trigger the snapshooter
  : script.  It seems to me a very useful feature to help simplify
  : operational issues.

  it's been a while since i looked at the SolrEventListener stuff, but i
  think that would be pretty easy to develop as a plugin.

  The existing postCommit/postOptimize/firstSearcher/newSearcher event
  listener tracking is part of the SolrCore because it needs to know about
  them when managing the index ... but if you just wanted a way to trigger
  arbitrary events by name, the utility functions used in SolrCore could
  be reused by a custom plugin ... then you could reuse things like the
  RunExecutableListener from your own RequestHandler with the same
  solrconfig.xml syntax.

  that would be a pretty cool addition to Solr ... an EventRequestHandler
  that takes in a single event param and triggers all of the Listeners
  configured for that event in the solrconfig.xml

  -Hoss

[Wu, Daniel] That sounds great.  Do I need to create a JIRA ticket?



Re: Solr - rudimentary problems

2007-09-17 Thread Venkatraman S
Perfect! Yes - that was the problem.
thanks a lot.

I am compiling a complete list of FAQs  - will update it in the wiki soon.

-vEnKAt

On 9/18/07, Chris Hostetter [EMAIL PROTECTED] wrote:

 :  The corresponding entry for this field in schema.xml is :
  :  <field name="id" type="text" indexed="true"
  : stored="true" multiValued="false" required="true"/>

  i'm guessing text is from the example schema.xml ... this is not a good
  type to use for a uniqueKey field ... that alone might be causing some of
  your problems with replacing docs ...  try string

  : 2) Also, at the time of deleting a document, by providing its ID
  : (exactly similar to the deleteById proc in the Embedded Solr example),
  : we find that the document is not getting deleted (and we also do not
  : get any errors).

 sounds like the same problem ... i'm guessing you are using a method that
 assumes the id has already been transformed into the internal
 representation ... with text that might be lowercased, or stemmed,
 etc





 -Hoss




--