Using multiple language stop words in Solr Core

2021-02-11 Thread Abhay Kumar
Hello Team,

Solr provides some field types out of the box in the managed schema for
different languages, such as English, French, Japanese, etc.

We are using the common field type "text_general" for field declarations and
stopwords.txt for stopword filtering.

While syncing data to the Solr core, we import text in different languages,
such as French, English, and German, into these fields.

My question is: should we put the stopwords for all of these languages into
the same "stopwords.txt" file, or how does Solr handle stopwords for
different languages?
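A small sketch of why the choice matters, written in plain Python rather than as Solr configuration. The two stopword sets are tiny, hypothetical samples (not the contents of any file shipped with Solr), and `remove_stopwords` is an illustrative helper. It suggests that a single merged list can drop words that are meaningful in another language.

```python
from typing import Iterable, List, Set

# Hypothetical, deliberately tiny stopword samples for illustration only.
ENGLISH_STOPWORDS: Set[str] = {"the", "and", "of", "in"}
GERMAN_STOPWORDS: Set[str] = {"die", "der", "und", "in"}


def remove_stopwords(tokens: Iterable[str], stopwords: Set[str]) -> List[str]:
    """Drop every token whose lowercase form is in the stopword set."""
    return [token for token in tokens if token.lower() not in stopwords]


tokens = "patients who die in hospital".split()

# Filtering with the English list only keeps the verb "die".
english_only = remove_stopwords(tokens, ENGLISH_STOPWORDS)

# Filtering with a merged list removes "die", because it is a German article.
merged = remove_stopwords(tokens, ENGLISH_STOPWORDS | GERMAN_STOPWORDS)
```

With the merged list, the English verb "die" disappears from English text. That is one reason a setup with one field and one stopword file per language might be preferable, although this thread does not say which approach was chosen.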



Warm Regards,

Abhay Kumar | Lead Developer
401/402, Pride Portal, Shivaji Housing Society, Off. S. B. Road | Shivaji 
Nagar, Pune-411 016
+91 20 2563 1011 | Mobile: +91 9096644108
anjusoftware.com<https://anjusoftware.com/>



Confidentiality Notice

This email message, including any attachments, is for the sole use of the 
intended recipient and may contain confidential and privileged information. Any 
unauthorized view, use, disclosure or distribution is prohibited. If you are 
not the intended recipient, please contact the sender by reply email and 
destroy all copies of the original message. Anju Software, Inc. 4500 S. 
Lakeshore Drive, Suite 620, Tempe, AZ USA 85282.


RE: Getting error "Bad Message 414 reason: URI Too Long"

2021-01-14 Thread Abhay Kumar
Thank you, Nicolas. Yes, we are making a POST request to Solr using the
SolrNet library.
The current request is approximately 32K characters long. I have tested with
a 10K-character request and it works fine.

Is there any way to increase the maximum request length in the Solr
configuration?

Thanks.
Abhay

-Original Message-
From: Nicolas Franck 
Sent: 14 January 2021 15:12
To: solr-user@lucene.apache.org
Subject: Re: Getting error "Bad Message 414 reason: URI Too Long"

Euh, sorry: I did not read your message well enough.
You did actually use a post request, with the parameters in the body
(your example suggests otherwise)
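For illustration, a minimal Python sketch of what "parameters in the body" means for the /select handler. It only builds the request and does not send it. The helper name `build_select_post` and the shortened term list are assumptions; the host, port and core name come from the query quoted below.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_select_post(base_url: str, core: str, query: str, rows: int = 10) -> Request:
    """Build a POST request whose query parameters travel in the form-encoded body."""
    body = urlencode({"q": query, "rows": rows, "wt": "json"}).encode("utf-8")
    return Request(
        url=f"{base_url}/{core}/select",  # no query string, so the URI stays short
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded; charset=UTF-8"},
        method="POST",
    )


# Shortened stand-in for the long OR query in this thread.
terms = ["Geisteswissenschaften", "Humanities", "Art"]
query = " OR ".join(f'"{term}"' for term in terms)
request = build_select_post("http://localhost:8983/solr", "documents", query)
```

Passing `request` to `urllib.request.urlopen` would send it. If a 414 still appears with a client library, it may be worth checking whether that library really places the parameters in the body, as the reply above hints.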

> On 14 Jan 2021, at 10:37, Nicolas Franck  wrote:
>
> I believe you can also access this path in a HTTP POST request.
> That way you do no hit the URI size limit
>
> cf. 
> https://stackoverflow.com/questions/2997014/can-you-use-post-to-run-a-query-in-solr-select
>
> I think some solr libraries already use this approach (e.g.  WebService::Solr 
> in perl)
>
> On 14 Jan 2021, at 10:31, Abhay Kumar <abhay.ku...@anjusoftware.com> wrote:
>
> Hello,
>
> I am trying to post the query below to Solr but am getting the error “Bad
> Message 414 reason: URI Too Long”.
>
> I am sending query using SolrNet library. Please suggest how to resolve this 
> issue.
>
> Query : 
> http://localhost:8983/solr/documents/select?q=%22Geisteswissenschaften%22%20OR%20%22Humanities%22%20OR%20%22Art%22%20OR%20%22Arts%22%20OR%20%22Caricatures%22%20OR%20%22Caricature%22%20OR%20%22Cartoon%22%20OR%20%22Engraving%20and%20Engravings%22%20OR%20%22Engravings%20and%20Engraving%22%20OR%20%22Engraving%22%20OR%20%22Engravings%22%20OR%20%22Human%20Body%22%20OR%20%22Human%20Bodies%22%20OR%20%22Human%20Figure%22%20OR%20%22Human%20Figures%22%20OR%20%22menschlicher%20K%C3%B6rper%22%20OR%20%22Menschliche%20Gestalt%22%20OR%20%22Body%20Parts%22%20OR%20%22K%C3%B6rperteile%22%20OR%20%22Body%20Parts%20and%20Fluids%22%20OR%20%22K%C3%B6rperteile%20und%20-fl%C3%BCssigkeiten%22%20OR%20%22Medical%20Illustration%22%20OR%20%22Medical%20Illustrations%22%20OR%20%22medizinische%20Illustration%22%20OR%20%22Anatomy%2C%20Artistic%22%20OR%20%22Artistic%20Anatomy%22%20OR%20%22Artistic%20Anatomies%22%20OR%20%22Medicine%20in%20Art%22%20OR%20%22Medicine%20in%20Arts%22%20OR%20%22Numismatics%22%20OR%20%22M%C3%BCnzkunde%22%20OR%20%22Coins%22%20OR%20%22Coin%22%20OR%20%22M%C3%BCnzen%22%20OR%20%22Medals%22%20OR%20%22Medal%22%20OR%20%22Denkm%C3%BCnzen%22%20OR%20%22Gedenkm%C3%BCnzen%22%20OR%20%22Medaillen%22%20OR%20%22Paintings%22%20OR%20%22Painting%22%20OR%20%22Philately%22%20OR%20%22Philatelies%22%20OR%20%22Postage%20Stamps%22%20OR%20%22Postage%20Stamp%22%20OR%20%22Briefmarken%22%20OR%20%22Portraits%22%20OR%20%22Portrait%22%20OR%20%22Sculpture%22%20OR%20%22Sculptures%22%20OR%20%22Awards%20and%20Prizes%22%20OR%20%22Prizes%20and%20Awards%22%20OR%20%22Awards%22%20OR%20%22Award%22%20OR%20%22Prizes%22%20OR%20%22Prize%22%20OR%20%22Nobel%20Prize%22%20OR%20%22Ethics%22%20OR%20%22Egoism%22%20OR%20%22Ethical%20Issues%22%20OR%20%22Ethical%20Issue%22%20OR%20%22Metaethics%22%20OR%20%22Metaethik%22%20OR%20%22Moral%20Policy%22%20OR%20%22Moral%20Policies%22%20OR%20%22Moralischer%20Grundsatz%22%20OR%20%22Natural%20Law%22%20OR%20%22Natural%20Laws%22%20OR%20%22Naturrecht%22%20OR%20%22Situational%20Ethics%22%20OR%2
0%22Bioethical%20Issues%22%20OR%20%22Bioethical%20Issue%22%20OR%20%22Bioethics%22%20OR%20%22Biomedical%20Ethics%22%20OR%20%22Health%20Care%20Ethics%22%20OR%20%22Ethics%2C%20Clinical%22%20OR%20%22Clinical%20Ethics%22%20OR%20%22klinische%20Ethik%22%20OR%20%22Complicity%22%20OR%20%22Mitt%C3%A4terschaft%22%20OR%20%22Moral%20Complicity%22%20OR%20%22Moralische%20Komplizenschaft%22%20OR%20%22Moralische%20Mitt%C3%A4terschaft%22%20OR%20%22Conflict%20of%20Interest%22%20OR%20%22Interest%20Conflict%22%20OR%20%22Interest%20Conflicts%22%20OR%20%22Ethical%20Analysis%22%20OR%20%22Ethical%20Analyses%22%20OR%20%22Casuistry%22%20OR%20%22Retrospective%20Moral%20Judgment%22%20OR%20%22Retrospective%20Moral%20Judgments%22%20OR%20%22retrospektive%20Moralische%20Beurteilung%22%20OR%20%22Wedge%20Argument%22%20OR%20%22Wedge%20Arguments%22%20OR%20%22Slippery%20Slope%20Argument%22%20OR%20%22Slippery%20Slope%20Arguments%22%20OR%20%22Argument%20der%20schiefen%20Ebene%22%20OR%20%22Ethical%20Relativism%22%20OR%20%22Ethical%20Review%22%20OR%20%22Ethikgutachten%22%20OR%20%22Ethics%20Consultation%22%20OR%20%22Ethics%20Consultations%22%20OR%20%22Ethical%20Theory%22%20OR%20%22Ethical%20Theories%22%20OR%20%22Normative%20Ethics%22%20OR%20%22Normative%20Ethic%22%20OR%20%22Consequentialism%22%20OR%20%22Deontological%20Ethics%22%20OR%20%22Deontological%20Ethic%22%20OR%20%22Deontologie%22%20OR%20%22Ethik%20der%20Pflichtenlehre%22%20OR%20%22Teleological%20Ethics%22%20OR%20%22Teleological%20Ethic%22%20OR%20%22Teleologische%20Ethik%22%20OR%20%22Utilitarianism%22%20OR%20%22Utilitarianisms%22%20OR%20%22Utilitarismus%22%20OR%20

Getting error "Bad Message 414 reason: URI Too Long"

2021-01-14 Thread Abhay Kumar
eren%20und%20Gravierungen%22%20OR%20%22Menschlicher%20K%C3%B6rper%22%20OR%20%22Medizinische%20Illustration%22%20OR%20%22Medizin%20in%20der%20Kunst%22%20OR%20%22Numismatik%22%20OR%20%22Malerei%22%20OR%20%22Philatelie%22%20OR%20%22Portr%C3%A4ts%22%20OR%20%22Portraits%20%5BPublication%20Type%5D%22%20OR%20%22Portraits%20(PT)%22%20OR%20%22Portr%C3%A4ts%20%5BDokumenttyp%5D%22%20OR%20%22Bildhauerei%22%20OR%20%22Anatomie%20in%20der%20Kunst%22%20OR%20%22Bioethische%20Fragestellungen%22%20OR%20%22Bioethik%22%20OR%20%22Ethik%2C%20klinische%22%20OR%20%22Interessenkonflikt%22%20OR%20%22Physician%20Self-Referral%22%20OR%20%22Physician%20Self%20Referral%22%20OR%20%22Physician%20Self-Referrals%22%20OR%20%22Eigen%C3%BCberweisung%20Arzt%22%20OR%20%22Arzt%20Eigen%C3%BCberweisung%22%20OR%20%22Arzt%2C%20Eigen%C3%BCberweisung%22%20OR%20%22Pr%C3%BCfung%20ethischer%20Standards%22%20OR%20%22Ethikberatung%22%20OR%20%22Ethiker%22%20OR%20%22Ethikkommissionen%22%20OR%20%22Ethik-Kommissionen%2C%20klinische%22%20OR%20%22Ethikkommissionen%20der%20Forschung%22%20OR%20%22Ethik%2C%20Wirtschafts-%22%20OR%20%22Ethik%2C%20institutionelle%22%20OR%20%22Ethik%2C%20Berufs-%22%20OR%20%22Ethik-Kodizes%22%20OR%20%22Ethik%2C%20pharmazeutische%22%20OR%20%22Helsinki-Deklaration%22%20OR%20%22Ethik%2C%20Forschungs-%22%20OR%20%22Berufliches%20Fehlverhalten%22%20OR%20%22Wissenschaftliches%20Fehlverhalten%22%20OR%20%22Astrologie%22%20OR%20%22Anthroposophie%22%20OR%20%22Buddhismus%22%20OR%20%22Christentum%22%20OR%20%22Katholizismus%22%20OR%20%22Christliche%20Wissenschaft%22%20OR%20%22Kirche%20Jesu%20Christi%20der%20Heiligen%20der%20Letzten%20Tage%22%20OR%20%22Christliche%20Orthodoxie%22%20OR%20%22Zeugen%20Jehovas%22%20OR%20%22Protestantismus%22%20OR%20%22Heilige%22%20OR%20%22Hinduismus%22%20OR%20%22Islam%22%20OR%20%22Judentum%22%20OR%20%22Religion%20und%20Medizin%22%20OR%20%22Religion%20und%20Psychologie%22%20OR%20%22Religion%20und%20Naturwissenschaft%22%20OR%20%22Religion%20und%20Sexualit%C3%A4t%22%20OR%20%22Religions
philosophien%22%20OR%20%22Konfuzianismus%22%20OR%20%22Mystik%22%20OR%20%22Spiritualismus%22%20OR%20%22Joga%22%20OR%20%22Theologie%22


Warm Regards,

Abhay Kumar | Lead Developer


How to remove special characters from suggestion in Solr

2020-10-28 Thread Abhay Kumar
Hello,

We are using the suggest component below in our Solr implementation.


  
 analyzinginfixsuggester
 analyzinginfixlookupfactory
 documentdictionaryfactory
 text_auto
 prefix_text
 true
 true
  
  
 FreeTextSuggester
 FreeTextLookupFactory
 DocumentDictionaryFactory
 text
 5
  
 text_general
 true
 true
  
   







  
  
  

  

One of our documents contains a large amount of data, and while syncing this
document using the SolrNet library we get the exception below.

SuggestComponent Exception in building suggester index for: 
AnalyzingInfixSuggester
java.lang.IllegalArgumentException: Document contains at least one immense term 
in field="exacttext" (whose UTF8 encoding is longer than the max length 32766), 
all of which were skipped.  Please correct the analyzer to not produce such 
terms.  The prefix of the first immense term is: '[77, 101, 100, 105, 99, 97, 
108, 32, 108, 97, 117, 110, 99, 104, 32, 112, 97, 99, 107, 10, 65, 98, 105, 
114, 97, 116, 101, 114, 111, 110]...', original message: bytes can be at most 
32766 in length; got 95994
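A short Python sketch for reading this message, offered as a diagnostic aid rather than a fix. It decodes the byte prefix printed in the exception and checks a value's UTF-8 length against the 32766-byte limit the message quotes. The helper names are hypothetical.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the exception message above


def utf8_length(value: str) -> int:
    """Length of the value in bytes once encoded as UTF-8."""
    return len(value.encode("utf-8"))


def exceeds_term_limit(value: str, limit: int = MAX_TERM_BYTES) -> bool:
    """True if a single un-split term built from this value would be too long."""
    return utf8_length(value) > limit


# Byte prefix copied from the exception text.
prefix_bytes = [77, 101, 100, 105, 99, 97, 108, 32, 108, 97, 117, 110, 99, 104,
                32, 112, 97, 99, 107, 10, 65, 98, 105, 114, 97, 116, 101, 114,
                111, 110]
prefix_text = bytes(prefix_bytes).decode("utf-8")
```

The decoded prefix reads "Medical launch pack", a newline, then "Abirateron", which helps identify the offending document. The reported size (95994 bytes) suggests the whole field value reached the index as one term.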

Please help us resolve this issue.

Any way to remove special characters from the suggestion results would also
help.

Thanks.
Abhay




Filtering Parent documents based on Child documents Facets selection

2020-10-16 Thread Abhay Kumar
I have a nested document which I am syncing to Solr:

{
   "id":"NCT04372953",
   "title":"Positive End-Expiratory Pressure (PEEP) Levels During Resuscitation of Preterm Infants at Birth (The POLAR Trial) ",
   "phase":"N/A",
   "status":"Not yet recruiting",
   "studytype":"Interventional",
   "SponsorName":[
  "Murdoch Childrens Research Institute|Children''s Hospital of Philadelphia|University of Amsterdam"
   ],
   "SponsorRole":[
  "lead|collaborator"
   ],
   "source":"Murdoch Childrens Research Institute",
   "sponsorrole":[
  "lead",
  "collaborator"
   ],
   "sponsorname":[
  "Murdoch Childrens Research Institute",
  "Children''s Hospital of Philadelphia",
  "University of Amsterdam"
   ],
   "investigatorsaffiliation":"",
   "investigatorname":[
  ""
   ],
   "therapeuticareaname":"",
   "text_suggest":[
  ""
   ],
   "investigatorrole":"",
   "_version_":1680437253090836480,
   "sites":{
  "id":"51002566",
  "facilitytype":"Hospital",
  "facilityname":"The Royal Women''s Hospital, Melbourne Australia",
  "facilitycountry":"Australia",
  "facilitystate":"Victoria",
  "facilitycity":"Parkville",
  "nodetype":"cnode",
  "facilityzip":"",
  "_nest_parent_":"NCT04372953",
  "phase":"",
  "studytype":"",
  "investigatorsaffiliation":"",
  "source":"",
  "title":"",
  "sponsorrole":[
 ""
  ],
  "investigatorname":[
 ""
  ],
  "therapeuticareaname":"",
  "text_suggest":[
 ""
  ],
  "investigatorrole":"",
  "sponsorname":[
 ""
  ],
  "status":"",
  "_version_":1680437253090836480
   },
   "investigators":[
  {
 "id":"6300662",
 "investigatorname":[
"Louise Owen"
 ],
 "nodetype":"cnode",
 "investigatorrole":"Principal Investigator",
 "investigatorsaffiliation":"The Royal Women''s Hospital, Melbourne Australia",
 "CongressScore":"",
 "TrialsScore":"Low",
 "PublicationScore":"",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
"therapeuticareaname":"",
 "text_suggest":[
""
 ],
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480
  },
  {
 "id":"6426782",
 "investigatorname":[
"David Tingay, MBBS FRACP"
 ],
 "nodetype":"cnode",
 "investigatorrole":"Study Chair",
 "investigatorsaffiliation":"Royal Children''s Hospital, Melbourne Australia",
 "CongressScore":"",
 "TrialsScore":"",
 "PublicationScore":"",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "therapeuticareaname":"",
 "text_suggest":[
""
 ],
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480
  },
  {
 "id":"7663364",
 "investigatorname":[
"Omar Kamlin"
 ],
 "nodetype":"cnode",
 "investigatorrole":"Principal Investigator",
 "investigatorsaffiliation":"The Royal Women''s Hospital, Melbourne Australia",
 "CongressScore":"",
 "TrialsScore":"Low",
 "PublicationScore":"",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "therapeuticareaname":"",
 "text_suggest":[
""
 ],
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480
  }
   ],
   "therapeuticareas":[
  {
 "id":"ta-0-NCT04372953",
 "therapeuticareaname":"Premature Birth",
 "text_prefixauto":"Premature Birth",
 "text_suggest":[
"Premature Birth"
 ],
 "diseaseareas":[
""
 ],
 "nodetype":"cnode",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "investigatorsaffiliation":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "investigatorname":[
""
 ],
 "investigatorrole":"",
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480,
 "therapeuticareaname_facet":"Premature Birth",
 "diseaseareas_facet":[
""
 ]
  },
  {
 "id":"ta-1-NCT04372953",
 "therapeuticareaname":"Lung Injury",
 "text_prefixauto":"Lung Injury",
 

Solr 8.6.2 Facets query for Nested documents

2020-10-13 Thread Abhay Kumar
Hello Team,

I have synced the following nested document into Solr 8.6.2.

{
   "id":"NCT04372953",
   "title":"Positive End-Expiratory Pressure (PEEP) Levels During Resuscitation of Preterm Infants at Birth (The POLAR Trial) ",
   "phase":"N/A",
   "status":"Not yet recruiting",
   "studytype":"Interventional",
   "SponsorName":[
  "Murdoch Childrens Research Institute|Children''s Hospital of Philadelphia|University of Amsterdam"
   ],
   "SponsorRole":[
  "lead|collaborator"
   ],
   "source":"Murdoch Childrens Research Institute",
   "sponsorrole":[
  "lead",
  "collaborator"
   ],
   "sponsorname":[
  "Murdoch Childrens Research Institute",
  "Children''s Hospital of Philadelphia",
  "University of Amsterdam"
   ],
   "investigatorsaffiliation":"",
   "investigatorname":[
  ""
   ],
   "therapeuticareaname":"",
   "text_suggest":[
  ""
   ],
   "investigatorrole":"",
   "_version_":1680437253090836480,
   "sites":{
  "id":"51002566",
  "facilitytype":"Hospital",
  "facilityname":"The Royal Women''s Hospital, Melbourne Australia",
  "facilitycountry":"Australia",
  "facilitystate":"Victoria",
  "facilitycity":"Parkville",
  "nodetype":"cnode",
  "facilityzip":"",
  "_nest_parent_":"NCT04372953",
  "phase":"",
  "studytype":"",
  "investigatorsaffiliation":"",
  "source":"",
  "title":"",
  "sponsorrole":[
 ""
  ],
  "investigatorname":[
 ""
  ],
  "therapeuticareaname":"",
  "text_suggest":[
 ""
  ],
  "investigatorrole":"",
  "sponsorname":[
 ""
  ],
  "status":"",
  "_version_":1680437253090836480
   },
   "investigators":[
  {
 "id":"6300662",
 "investigatorname":[
"Louise Owen"
 ],
 "nodetype":"cnode",
 "investigatorrole":"Principal Investigator",
 "investigatorsaffiliation":"The Royal Women''s Hospital, Melbourne Australia",
 "CongressScore":"",
 "TrialsScore":"Low",
 "PublicationScore":"",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "therapeuticareaname":"",
 "text_suggest":[
""
 ],
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480
  },
  {
 "id":"6426782",
 "investigatorname":[
"David Tingay, MBBS FRACP"
 ],
 "nodetype":"cnode",
 "investigatorrole":"Study Chair",
 "investigatorsaffiliation":"Royal Children''s Hospital, Melbourne Australia",
 "CongressScore":"",
 "TrialsScore":"",
 "PublicationScore":"",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "therapeuticareaname":"",
 "text_suggest":[
""
 ],
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480
  },
  {
 "id":"7663364",
 "investigatorname":[
"Omar Kamlin"
 ],
 "nodetype":"cnode",
 "investigatorrole":"Principal Investigator",
 "investigatorsaffiliation":"The Royal Women''s Hospital, Melbourne Australia",
 "CongressScore":"",
 "TrialsScore":"Low",
 "PublicationScore":"",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "therapeuticareaname":"",
 "text_suggest":[
""
 ],
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480
  }
   ],
   "therapeuticareas":[
  {
 "id":"ta-0-NCT04372953",
 "therapeuticareaname":"Premature Birth",
 "text_prefixauto":"Premature Birth",
 "text_suggest":[
"Premature Birth"
 ],
 "diseaseareas":[
""
 ],
 "nodetype":"cnode",
 "_nest_parent_":"NCT04372953",
 "phase":"",
 "studytype":"",
 "investigatorsaffiliation":"",
 "source":"",
 "title":"",
 "sponsorrole":[
""
 ],
 "investigatorname":[
""
 ],
 "investigatorrole":"",
 "sponsorname":[
""
 ],
 "status":"",
 "_version_":1680437253090836480,
 "therapeuticareaname_facet":"Premature Birth",
 "diseaseareas_facet":[
""
 ]
  },
  {
 "id":"ta-1-NCT04372953",
 "therapeuticareaname":"Lung Injury",
 "text_prefixauto":"Lung Injury",
 

Index Deeply Nested documents and retrieve a full nested document in solr

2020-09-24 Thread Abhay Kumar
Hello Team,

Can someone please help me index the sample JSON document below into Solr?

I have the following questions about indexing multi-level child documents:


  1.  Can we assign names to levels of the document hierarchy, such as
"therapeuticareas" or "sites", while indexing?
  2.  How can we index a document with a multi-level hierarchy?

I have the following question about retrieving results:


  1.  How can I retrieve a result with its full nested structure?

[{
   "id": "NCT0102",
   "title": "Congenital Adrenal Hyperplasia: Calcium Channels as Therapeutic Targets",
   "phase": "Phase 1/Phase 2",
   "status": "Completed",
   "studytype": "Interventional",
   "enrollmenttype": "",
   "sponsorname": ["National Center for Research Resources (NCRR)"],
   "sponsorrole": ["lead"],
   "score": [0],
   "source": "National Center for Research Resources (NCRR)",
   "therapeuticareas": [{
         "taid": "ta1",
         "ta": "Lung Cancer",
         "diseaseAreas": ["Oncology, Respiratory tract diseases"],
         "pubmeds": [{
            "pmbid": "pm1",
            "articleTitle": "Consensus minimum data set for lung cancer multidisciplinary teams Results of a Delphi process",
            "revisedDate": "2018-12-11T18:30:00Z"
         }],
         "conferences": [{
            "confid": "conf1",
            "conferencename": "American Academy of Neurology Annual Meeting",
            "conferencetopic": "Avances en el manejo de los trastornos del movimiento hipercineticos",
            "conferencedate": "2019-05-08T18:30:00Z"
         }]
      },
      {
         "taid": "ta2",
         "ta": "Breast Cancer",
         "diseaseAreas": ["Oncology"],
         "pubmeds": [],
         "conferences": []
      }
   ],

   "sites": [{
      "siteid": "site1",
      "type": "Hospital",
      "institutionname": "Methodist Health System",
      "country": "United States",
      "state": "Texas",
      "city": "Dallas",
      "zip": ""
   }],

   "investigators": [{
      "invid": "inv1",
      "investigatorname": "Bryan A Faller",
      "role": "Principal Investigator",
      "location": "",
      "score": ""
   }],

   "Drugs": [{
      "id": "11",
      "drugname": "Methotrexate",
      "activeIngredient": "Methotrexate Sodium"
   }]
}]
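To make the hierarchy question concrete, here is a plain-Python sketch that walks a document shaped like the sample above and lists every child document with the path of key names leading to it. `iter_child_docs` and the trimmed `sample` are hypothetical. It does not call Solr and only shows the relationship names ("therapeuticareas", "sites", and so on) that a nested index would need to preserve.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_child_docs(doc: Dict[str, Any], path: str = "") -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (path, child) for every nested object, depth first."""
    for key, value in doc.items():
        # Treat a single nested object like a one-element list of children.
        children = value if isinstance(value, list) else [value]
        for child in children:
            if isinstance(child, dict):
                child_path = f"{path}/{key}"
                yield child_path, child
                yield from iter_child_docs(child, child_path)


# Trimmed version of the sample document, keeping only the structure.
sample = {
    "id": "NCT0102",
    "sponsorname": ["National Center for Research Resources (NCRR)"],
    "therapeuticareas": [
        {"taid": "ta1", "pubmeds": [{"pmbid": "pm1"}], "conferences": []},
    ],
    "sites": [{"siteid": "site1"}],
}

paths = [path for path, _ in iter_child_docs(sample)]
```

Plain string lists such as "sponsorname" are skipped because their elements are not objects. Only nested objects count as child documents in this sketch.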

Thanks.
Abhay



Result Clustering Vs Semantic Knowledge Graphs in Solr

2020-09-08 Thread Abhay Kumar
Hello,

Can someone please help me understand the difference between the Result
Clustering and "Semantic Knowledge Graph" components of Solr?

In which scenarios should we use "Result Clustering", and in which "Semantic
Knowledge Graph"?

Thanks.
Abhay



Semantic Knowledge Graph Jar File

2020-09-04 Thread Abhay Kumar
Hello,

I need to integrate Semantic Knowledge Graph with a Solr 7.7.0 instance.
Can someone provide the jar file for "Semantic Knowledge Graph"? I work
on the .NET platform and do not know how to build it using Maven,
so I need a compiled jar file for "Semantic Knowledge Graph".

Also, any help with integrating "Semantic Knowledge Graph" with Solr will be
appreciated.

Regards,
Abhay



Re: Comma-delimited words shown in terms as one word.

2011-05-27 Thread abhay kumar
Thanks, I was looking for exactly this.
I needed to split tokens on commas.
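A plain-Python model of the two behaviours discussed in this thread, as an illustration rather than Solr code. Joining the parts imitates the single term reported in the original question, and splitting on a comma pattern imitates what a pattern-based tokenizer would produce. Both function names are hypothetical.

```python
import re
from typing import List


def split_on_commas(text: str) -> List[str]:
    """Split on commas and drop empty pieces, like a tokenizer with a ',' pattern."""
    return [part.strip() for part in re.split(r",", text) if part.strip()]


def catenate_parts(text: str) -> str:
    """Join the comma-separated parts into one lowercase term."""
    return "".join(split_on_commas(text)).lower()


tokens = split_on_commas("John,Mark,Sam")
single_term = catenate_parts("John,Mark,Sam")
```

`tokens` holds three separate words, while `single_term` is the one run-together term seen in the term vector.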

On Fri, Jun 18, 2010 at 10:12 PM, Joe Calderon <calderon@gmail.com> wrote:

> set generateWordParts=1 on wordDelimiter or use
> PatternTokenizerFactory to split on commas
>
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory
>
> you can use the analysis page to see what your filter chains are going
> to do before you index
>
> /admin/analysis.jsp
>
> On Fri, Jun 18, 2010 at 6:41 AM, Vitaliy Avdeev <vavd...@sistyma.net>
> wrote:
>> Hello.
>> In indexing text I have such string John,Mark,Sam. Then I looks at it in
>> TermVectorComponent it looks like this johnmarksam.
>>
>> I am using this type for storing data
>>
>> <fieldType name="textTight2" class="solr.TextField"
>>            positionIncrementGap="100">
>>   <analyzer>
>>     <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
>>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>>             ignoreCase="true" expand="false"/>
>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>             words="stopwords.txt"/>
>>     <filter class="solr.WordDelimiterFilterFactory"
>>             generateWordParts="0" generateNumberParts="0" catenateWords="1"
>>             catenateNumbers="1" catenateAll="0"/>
>>     <filter class="solr.LowerCaseFilterFactory"/>
>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>>   </analyzer>
>> </fieldType>
>>
>> What filter I need to use to get John Mark Sam as different words?
 




-- 
Thanks and Regards
Abhay Kumar Singh


How to open/update/delete remote index ?

2010-06-18 Thread abhay kumar
Hi,

I am working with Solr in production, configured on a remote server.

I need to delete some documents from the Solr index.

I know this can be done with curl by calling the Solr update request handler,
but I'm looking for a GUI tool.

I tried Luke, but Luke doesn't open a remote index.

Is there any tool that can open/delete/update a remote index?

A quick reply will be appreciated.

Regards,
Abhay


Re: using DataImportHandler with ExtractRequestHandler ?

2009-10-14 Thread abhay kumar
Thanks, Steven, for the quick reply.

On Wed, Oct 14, 2009 at 1:56 AM, Steven A Rowe <sar...@syr.edu> wrote:

> See http://issues.apache.org/jira/browse/SOLR-1358
>
> Steve




using DataImportHandler with ExtractRequestHandler ?

2009-10-13 Thread abhay kumar
Hi ,

We are using solr-1.4 for our search module.

We have a long schema (35 fields). Some field values come from a database,
and one field value comes from files in different formats.

We are able to index the different file formats using Solr Cell
(ExtractRequestHandler). Data from the database can be indexed using
DataImportHandler.

Now I want to call both request handlers (DataImportHandler and
ExtractRequestHandler) at the same time for each document.
Is that possible? If so, how?

Or can DataImportHandler call ExtractRequestHandler, or vice versa?

Or can these two request handlers be combined for one document?
If yes, how?

*For example:*

Take two fields:

resumeContent = its value is stored in a file (pdf, word, doc), so we need
to use ExtractRequestHandler to get its value.

resumeTitle = its value is stored in the database, so I need to use
DataImportHandler to get its value from the database.

These two fields make up one document.


How can DataImportHandler be used with ExtractRequestHandler, or vice versa,
for a single document in which some field values come from the database and
others come from different document formats?

I don't want to extract the different document formats and store their
content (body) in the database before indexing.

We are doing agile development, so a quick response will be appreciated.

Regards,
Abhay


Re: acts_as_solr integration with solr separately

2009-09-21 Thread abhay kumar
Thanks, Erik, for your kind response.

I am using the acts_as_solr stable release v0.9.

I downloaded it from http://github.com/mattmatt/acts_as_solr/tree/master

Regards
Abhay


On Mon, Sep 21, 2009 at 3:41 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> acts_as_solr accesses the Solr server listed in the config solr.yml file.
> You don't have to use the start/stop Rake actions, they are really just
> conveniences for development/testing (I personally would launch Solr
> separately in production though).
>
> Out of curiosity, what acts_as_solr version/branch are you using?
>
> Erik







Re: acts_as_solr integration with solr separately

2009-09-20 Thread abhay kumar
Does anybody have any bright ideas?

Regards
Abhay




acts_as_solr integration with solr separately

2009-09-18 Thread abhay kumar
Hi,

I have set up a Solr search server in Tomcat.

I am able to run queries (of any kind) and get results in XML format.

Now I want to integrate it (Solr) with Ruby on Rails.

I know Ruby on Rails has the acts_as_solr plugin, which helps in
integrating (talking) with Solr.

acts_as_solr comes bundled with a Solr web application on a Jetty server,

but I don't want to use this bundled Solr web application;

e.g. I don't want to run rake solr:start.

I am running Solr as a separate search server in Tomcat on port 8983 (URL
http://localhost:8983/solr/ and all other URLs are listening).

Now I want to talk to this separate Solr server using the acts_as_solr
plugin.

Questions:
1) Can anybody point me to how to do this?
   Is there a tutorial?
2) What changes do I have to make in the acts_as_solr plugin?

3) Any good pointers (URLs) will be appreciated.

Regards
Abhay


Re: Retrieving a field from all result documents and a couple more queries

2009-09-16 Thread abhay kumar
Hi,

1) Solr has various types of caches. We can specify how many documents a
cache can hold at a time.
   e.g. if queryResultWindowSize=50,
   50 results will be cached in the queryResultCache.
   If the user requests results beyond the first 50 documents, a new request
   is sent to the server, and the server retrieves the next 50 results into
   the cache.
   http://wiki.apache.org/solr/SolrCaching
   Yes, Solr looks into the cache to retrieve the fields to be returned.

2) Yes, we can have different tokenizers or filters for index and search. We
need not create a different fieldType. We configure the index and query
analyzer sections of the same fieldType (data type) differently.

   e.g.

<fieldType name="textSpell" class="solr.TextField"
           positionIncrementGap="100" stored="false" multiValued="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>

    <!--<filter class="solr.SynonymFilterFactory"
            synonyms="Synonyms.txt" ignoreCase="true" expand="false"/>-->
    <filter class="solr.StopFilterFactory" ignoreCase="true"
            words="stopwords.txt"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>

    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>
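The window-size behaviour from point 1 can be modelled in a few lines of Python. This is a simplified sketch of how the result-window rounding is generally described, not Solr's actual code, and `superset_size` is a hypothetical name. The assumption is that Solr collects results from the first hit up to the next multiple of the window size that covers the requested page.

```python
def superset_size(start: int, rows: int, window_size: int) -> int:
    """Number of leading results collected and cached for one page request.

    Assumed model: the end of the requested page is rounded up to a
    multiple of window_size, and never falls below one full window.
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    last_requested = start + rows
    if last_requested < window_size:
        return window_size
    return ((last_requested - 1) // window_size + 1) * window_size


first_page = superset_size(start=0, rows=10, window_size=50)
page_in_window = superset_size(start=40, rows=10, window_size=50)
page_past_window = superset_size(start=50, rows=10, window_size=50)
```

Under this model, any page inside the first 50 results is served from the same cached window, and the first request past it makes Solr collect a larger window.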



Regards,
Abhay

On Tue, Sep 15, 2009 at 6:41 PM, Shashikant Kore <shashik...@gmail.com> wrote:

> Hi,
>
> I am familiar with Lucene and trying out Solr.
>
> I have index which was created outside solr. The index is fairly
> simple with two field - document_id and content. The query result needs
> to return all the document IDs. The result need not be ordered by the
> score. For this, in Lucene, I use custom hit collector with search to
> get results quickly. The index has a few million documents and queries
> returning hundreds of thousands of documents are not uncommon. So, the
> speed is crucial here.
>
> Since retrieving the document_id for each document is slow, I am using
> FieldCache to store the values of document_id. For all the results
> collected (in a bitset) with hit collector, document_id field is
> retrieved from the fieldcache.
>
> 1. How can I effectively disable scoring? I have read that
> ConstantScoreQuery is quite fast, but from the code, I see that it is
> used only for wildcard queries. How can I use ConstantScoreQuery for
> all the queries (boolean, term, phrase, ..)?  Also, is
> ConstantScoreQuery as fast as a custom hit collector?
>
> 2. How can Solr take advantage of the fieldcache while returning the
> field document_id? The documentation says, fieldcache can be
> explicitly auto warmed with Solr.  If fieldcache is available and
> initialized at the beginning, will solr look into the cache to
> retrieve the fields to be returned?
>
> 3. If there is an additional field for stemmed_content on which search
> needs to use different analyzer, I suppose, that could be specified by
> fieldType attribute in the schema.
>
> Thank you,
>
> --shashi