I have a project where I am working with nested data (not very deep, but with multiple lists) and would love some advice from other experienced developers. I've read most of the books on Solr (including Solr in Action), and although they provide good (if dated) information on the indexing mechanism itself, few of them deal with this issue in any depth.

If there are other resources that aren't necessarily Solr-specific but can help here, please feel free to point those out.

Here is the structure I'm working with. I've made it generic to simplify things, but the intent is preserved.

{
    id: 1,
    _type: "book",
    name: "My Martian",
    genre: "Science Fiction",
    edits: [
        {
            _type: "book_action",
            action: "Modify",
            chapter: 3,
            description: "Corrected spelling for interstellar"
        }, {
            _type: "book_action",
            action: "Removal",
            chapter: 24,
            description: "Removed chapter as it adds no value to the story"
        }
    ],
    chapters: [
        {
            _type: "book_chapter",
            chapter_number: 1,
            chapter_title: "The Test"
        }, {
            _type: "book_chapter",
            chapter_number: 2,
            chapter_title: "The Next Test"
        }
    ]
}

My first attempt was to add both lists as child documents through SolrJ (I can't do this with the JSON interface since it doesn't allow multiple _childDocuments_ entries at the same level). That works, and I'm able to use the _type value to distinguish between them. However, the users want to be able to search on any field at the top level of the data as well as within the lists. For example (using SQL for clarity only):

select * from book_index where genre = "Science Fiction" and action = "Removal" and chapter_number = 2;

The problem I'm having with this sort of search is that, based on what I know, the {!child ...} and {!parent ...} query parsers won't give me access to all of the fields in a single query like this.
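To make the first attempt concrete, here is a rough sketch of the shape of the indexed block, using plain Python dicts as a stand-in for the SolrJ documents. The helper names (to_block, children_of_type) are made up for illustration, and the _childDocuments_ key is only borrowed from the JSON format as a label for the single list of children. It is not a description of how SolrJ stores things internally.

```python
# Sketch only: plain dicts standing in for the parent/child documents.
book = {
    "id": 1,
    "_type": "book",
    "name": "My Martian",
    "genre": "Science Fiction",
    "edits": [
        {"_type": "book_action", "action": "Modify", "chapter": 3,
         "description": "Corrected spelling for interstellar"},
        {"_type": "book_action", "action": "Removal", "chapter": 24,
         "description": "Removed chapter as it adds no value to the story"},
    ],
    "chapters": [
        {"_type": "book_chapter", "chapter_number": 1, "chapter_title": "The Test"},
        {"_type": "book_chapter", "chapter_number": 2, "chapter_title": "The Next Test"},
    ],
}


def to_block(doc):
    """Hypothetical helper: merge both lists into one list of children."""
    parent = {k: v for k, v in doc.items() if k not in ("edits", "chapters")}
    # Both kinds of child end up side by side; only _type tells them apart.
    parent["_childDocuments_"] = list(doc.get("edits", [])) + list(doc.get("chapters", []))
    return parent


def children_of_type(block, type_name):
    """Hypothetical helper: pick out the children of one kind by _type."""
    return [c for c in block["_childDocuments_"] if c["_type"] == type_name]


block = to_block(book)
chapters = children_of_type(block, "book_chapter")
```

Getting one kind of child back is easy with this layout. The difficulty is a single query that constrains the parent and both kinds of child at once.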

I've looked at flattening the data similar to the following:

{
    id: 1,
    name: "My Martian",
    genre: "Science Fiction",
    edit_action_3: {
        action: "Modify",
        chapter: 3,
        description: "Corrected spelling for interstellar"
    },
    edit_action_24: {
        action: "Removal",
        chapter: 24,
        description: "Removed chapter as it adds no value to the story"
    },
    chapter_1: {
        chapter_number: 1,
        chapter_title: "The Test"
    },
    chapter_2: {
        chapter_number: 2,
        chapter_title: "The Next Test"
    }
}

This does flatten things out so that the above query would be able to search on any field, but it's a real kludge and makes it nearly impossible to get just a list of chapters or actions.
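As a sketch of what that flattening amounts to, here it is in plain Python, with no Solr involved. The function names (flatten_book, list_chapters) are hypothetical, and the key scheme is just the one from the example above.

```python
# Sketch only: reproduces the flattened layout shown above from the nested one.
book = {
    "id": 1,
    "_type": "book",
    "name": "My Martian",
    "genre": "Science Fiction",
    "edits": [
        {"_type": "book_action", "action": "Modify", "chapter": 3,
         "description": "Corrected spelling for interstellar"},
        {"_type": "book_action", "action": "Removal", "chapter": 24,
         "description": "Removed chapter as it adds no value to the story"},
    ],
    "chapters": [
        {"_type": "book_chapter", "chapter_number": 1, "chapter_title": "The Test"},
        {"_type": "book_chapter", "chapter_number": 2, "chapter_title": "The Next Test"},
    ],
}


def _without_type(doc):
    """Drop the _type marker, which the flattened layout does not carry."""
    return {k: v for k, v in doc.items() if k != "_type"}


def flatten_book(doc):
    """Hypothetical helper: hoist each list entry to a top-level key."""
    flat = {k: v for k, v in doc.items() if k not in ("_type", "edits", "chapters")}
    for edit in doc.get("edits", []):
        # Keyed by chapter, so two edits to the same chapter would collide.
        flat[f"edit_action_{edit['chapter']}"] = _without_type(edit)
    for chapter in doc.get("chapters", []):
        flat[f"chapter_{chapter['chapter_number']}"] = _without_type(chapter)
    return flat


def list_chapters(flat):
    """Recovering the chapter list means scanning key names by prefix."""
    return [v for k, v in flat.items() if k.startswith("chapter_")]


flat = flatten_book(book)
chapters = list_chapters(flat)
```

The prefix scan in list_chapters is the kludge I mean: the list structure now lives only in the key names.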

So, does anyone have any thoughts? (FYI, this is my first Solr project, so I'm really starting from scratch here.)

Thanks



