Are there any gotchas I should be aware of when creating a document that
could contain thousands of pages of text ( a Company and thousands of
nested Files ) in addition to dozens/hundreds of fields?
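
For concreteness, here is a minimal sketch of how the nested variant might be queried, written as a Python helper that only builds the request body. The nested path "files" and the field "files.content" are hypothetical names chosen for illustration, not taken from my mapping. The "score_mode" value is also an assumption: the accepted values and their spelling differ between Elasticsearch versions (some older releases use "total" rather than "sum"), so check the docs for the version in use.

```python
import json


def build_nested_company_query(text, files_path="files", file_field="files.content"):
    """Build a bool/should body that scores a Company on its own fields
    and on its nested Files.

    files_path and file_field are hypothetical names for illustration.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Match on the Company's own fields, as in the current query.
                    {"query_string": {"default_field": "_all", "query": text}},
                    {
                        "nested": {
                            "path": files_path,
                            # Assumption: "sum" adds up the scores of all
                            # matching nested Files; the spelling of this
                            # value is version dependent.
                            "score_mode": "sum",
                            "query": {"match": {file_field: text}},
                        }
                    },
                ],
                # Same key the current query uses.
                "minimum_number_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_nested_company_query("adhesive"), indent=2))
```

The existing filters would still wrap this bool query in the same way they do today.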
On Friday, May 9, 2014 9:54:40 AM UTC-7, Brian Jones wrote:
>
> It seems like nesting the Files within the Company docs may be the only
> solution here. That is definitely an option. I had indexed the Files as
> children of Companies so that I could query the Files as a separate index (
> which I also need to do ), but I can maintain a separate index altogether if
> need be.
>
> On Friday, May 9, 2014 9:10:19 AM UTC-7, Brian Jones wrote:
>>
>> I have an index with parent documents ( Companies ), that have children (
>> Files ). Each Company can have hundreds of Files. Companies and Files
>> both have many fields.
>>
>> The search I'm trying to perform is to find the Company that best matches
>> based on its own fields and the fields of its children ( the Files ). The
>> current query I run is a Bool-Should query where I perform a has_child
>> query on the Files and a regular query on the Companies. I only require a
>> minimum of one match so, as I understand it, a Company that matches its
>> own fields and one of its children will score higher than a Company that
>> only matches its own fields. You'll see I also have to apply a number of
>> filters to the Companies.
>>
>> I'm wondering if there is a way to query the system where it will take
>> all the children into account, and not just one. If ten Files match the
>> query, then that Company result would likely score higher than a Company
>> that only had a few files match ... obviously there would be other scoring
>> going on ... so maybe some sort of multiplier applied to the sum of
>> children scores would be appropriate. It's defining a query that matches
>> multiple children that I'm unable to figure out.
>>
>> Here is an example of the query that I currently use:
>>
>> {
>>   "query": {
>>     "filtered": {
>>       "filter": {
>>         "and": [
>>           {
>>             "terms": {
>>               "_cache": true,
>>               "execution": "or",
>>               "locations.state": [
>>                 "california",
>>                 "maryland"
>>               ]
>>             }
>>           },
>>           {
>>             "terms": {
>>               "_cache": true,
>>               "execution": "and",
>>               "industries.term.not_analyzed": [
>>                 "aerospace",
>>                 "defense"
>>               ]
>>             }
>>           },
>>           {
>>             "geo_distance": {
>>               "locations.geolocation": {
>>                 "lat": "41",
>>                 "lon": "-82"
>>               },
>>               "distance": "25mi"
>>             }
>>           }
>>         ]
>>       },
>>       "query": {
>>         "bool": {
>>           "should": [
>>             {
>>               "query_string": {
>>                 "default_field": "_all",
>>                 "query": "adhesive"
>>               }
>>             },
>>             {
>>               "has_child": {
>>                 "type": "file",
>>                 "query": {
>>                   "query_string": {
>>                     "default_field": "_all",
>>                     "query": "adhesive"
>>                   }
>>                 }
>>               }
>>             }
>>           ],
>>           "minimum_number_should_match": 1
>>         }
>>       }
>>     }
>>   }
>> }
>>
>>
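
On the question in the quoted message about taking all matching children into account: has_child has a scoring option that folds the scores of the matching children into the parent's score. The following is a minimal sketch under assumptions, not a tested answer. It is a Python helper that only builds the bool part of the query, and it assumes the option is spelled "score_mode" with the value "sum" in the version in use (I believe some older releases call it "score_type"). The function name and its parameters are made up for illustration.

```python
import json


def build_has_child_sum_query(text, child_type="file"):
    """Build the bool/should part of the query, with has_child asked to
    sum the scores of every matching child into the parent's score.

    The function name and parameters are hypothetical.
    """
    child_query = {"query_string": {"default_field": "_all", "query": text}}
    return {
        "bool": {
            "should": [
                # Regular query on the Company's own fields.
                {"query_string": {"default_field": "_all", "query": text}},
                {
                    "has_child": {
                        "type": child_type,
                        # Assumption: "sum" aggregates all matching child
                        # scores, so ten matching Files contribute more
                        # than two. The option name is version dependent.
                        "score_mode": "sum",
                        "query": child_query,
                    }
                },
            ],
            # Same key the original query uses.
            "minimum_number_should_match": 1,
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_has_child_sum_query("adhesive"), indent=2))
```

The result would take the place of the "bool" object inside the filtered query above, leaving the filters unchanged. A plain sum favors Companies with many Files, so some normalization or a different mode may turn out to fit better.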