Re: Is complex query like this possible?
Hello, Chris.

Thank you and Mikhail for the explanation. I'll think about changing the indexing model to be able to handle this case.

: DIR:true
: PATH:/root/folder1/folder2/
: NAME:folder3
: SIZE:0 ...
: DIR:false
: PATH:/root/folder1/folder2/folder3/
: NAME:image.jpg
: SIZE:1234567 ...
: your solution). Also, in my previous example a file of specified type
: may be deeper than one level: if there are /root/folder1, /root/folder2
: and file /root/folder1/aaa/bbb/ccc/image.jpg, and I query for folder,
: only folder1 must be returned.

I don't think you're going to find an *easy* way to do what you want -- solr is designed to return *documents* that match queries, and you've modeled documents to match individual files -- so it's not easy to get solr to return the ancestor directories of those files as results.

grouping could be used for something like "find the parent directories of files that match this query" if you grouped on the PATH, but that won't help you with your expectation that an example like /root/folder1/aaa/bbb/ccc/image.jpg should return /root/folder1

-Hoss

--
Best regards,
Asv
mailto:asvs...@gmail.com
Is complex query like this possible?
Hello,

We use Solr to search over a filesystem, so a lot of files and folders are indexed; the name and path of each file are stored in separate fields. The task is to find folders by name that also contain at least one file of a specific type somewhere inside.

For example, we search by the phrase "test" and for JPG files, and have two folders:

1) test1 - empty folder
2) test2 - contains 1 file abcd.jpg inside.

The search result must contain only folder test2, because test1 does not satisfy the second criterion.

The SQL equivalent of such a search query looks like:

    SELECT * FROM indexed_files t1
    WHERE t1.name LIKE '%test%'
      AND (SELECT COUNT(*) FROM indexed_files t2
           WHERE t2.path LIKE CONCAT(t1.path, '%')
             AND t2.name LIKE '%jpg') > 0;

The question is: is it possible to do such a search in Solr with a single query? A single query is important because we need to use Solr's paging (start and rows parameters), so we should avoid filtering out wrong results in our code. I've read the Solr wiki about nested queries but haven't found a way to do it.

BTW, does Solr provide an equivalent of the SELECT COUNT(*) statement to access the count of found records directly in a Solr query? Or is such a complex query completely impossible?

--
Best regards,
Asv
mailto:asvs...@gmail.com
Re: Is complex query like this possible?
Hello Sergey,

if your docs look like:

PATH:'directory','tree','segments','test1'
FILES:'filename1','ext1','filename2','ext2','filename3','ext3','filename4','ext4'

you can search them with:

+PATH:test1 +FILES:jpg

2012/2/1 Sergei Ananko asvs...@gmail.com

> Hello,
>
> We use Solr to search over a filesystem, so a lot of files and folders are indexed; the name and path of each file are stored in separate fields. The task is to find folders by name that also contain at least one file of a specific type somewhere inside.
>
> For example, we search by the phrase "test" and for JPG files, and have two folders:
>
> 1) test1 - empty folder
> 2) test2 - contains 1 file abcd.jpg inside.
>
> The search result must contain only folder test2, because test1 does not satisfy the second criterion.
>
> The SQL equivalent of such a search query looks like:
>
>     SELECT * FROM indexed_files t1
>     WHERE t1.name LIKE '%test%'
>       AND (SELECT COUNT(*) FROM indexed_files t2
>            WHERE t2.path LIKE CONCAT(t1.path, '%')
>              AND t2.name LIKE '%jpg') > 0;
>
> The question is: is it possible to do such a search in Solr with a single query? A single query is important because we need to use Solr's paging (start and rows parameters), so we should avoid filtering out wrong results in our code. I've read the Solr wiki about nested queries but haven't found a way to do it.
>
> BTW, does Solr provide an equivalent of the SELECT COUNT(*) statement to access the count of found records directly in a Solr query? Or is such a complex query completely impossible?
>
> --
> Best regards,
> Asv
> mailto:asvs...@gmail.com

--
Sincerely yours
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
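The folder-level document layout suggested above can be sketched in Python. This is only an illustration of how such documents might be prepared before indexing, not Solr code and not necessarily what Mikhail had in mind. PATH and FILES follow the field names in the message; ID and NAME, the helper names, and the choice to store only file extensions in FILES are assumptions made for the sketch. Paths are assumed to be absolute POSIX paths. Each file's extension is credited to every ancestor folder, so a file several levels down still counts for the top-level folder.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def build_folder_docs(file_paths, folder_paths=()):
    """Build one document per folder, aggregating extensions of all files below it."""
    extensions = defaultdict(set)
    for folder in folder_paths:
        # Touch the key so that empty folders still get a document.
        extensions[str(PurePosixPath(folder))]
    for file_path in file_paths:
        path = PurePosixPath(file_path)
        ext = path.suffix.lstrip(".").lower()
        # Credit the extension to every ancestor, not only the direct parent.
        for ancestor in path.parents:
            found = extensions[str(ancestor)]
            if ext:
                found.add(ext)
    docs = []
    for folder, exts in sorted(extensions.items()):
        segments = [s for s in folder.split("/") if s]
        if not segments:
            continue  # skip the filesystem root itself
        docs.append({
            "ID": folder,            # hypothetical unique key
            "PATH": segments,        # path split into segments
            "NAME": segments[-1],    # folder's own name
            "FILES": sorted(exts),   # extensions found anywhere below
        })
    return docs


def find_folders(docs, name_part, ext):
    """In-memory stand-in for a query on folder name plus contained file type."""
    return [d["ID"] for d in docs if name_part in d["NAME"] and ext in d["FILES"]]


docs = build_folder_docs(
    ["/root/folder1/aaa/bbb/ccc/image.jpg"],
    folder_paths=["/root/folder1", "/root/folder2"],
)
print(find_folders(docs, "folder", "jpg"))  # ['/root/folder1']
```

The sketch matches on NAME rather than PATH so that descendants such as /root/folder1/aaa are not returned just because folder1 appears among their path segments. The cost of this model is that adding or removing a file means re-indexing every ancestor folder document.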
Re[2]: Is complex query like this possible?
Hello, Mikhail.

Each index record looks like:

DIR:true
PATH:/root/folder1/folder2/
NAME:folder3
SIZE:0
...

This record represents the folder /root/folder1/folder2/folder3

DIR:false
PATH:/root/folder1/folder2/folder3/
NAME:image.jpg
SIZE:1234567
...

This is the file /root/folder1/folder2/folder3/image.jpg

I.e. PATH is the path to the parent directory, and NAME is the actual name of the file or folder. We do not store a list of children in the folder record (as in your solution). Also, in my previous example a file of the specified type may be deeper than one level: if there are /root/folder1, /root/folder2 and the file /root/folder1/aaa/bbb/ccc/image.jpg, and I query for "folder", only folder1 must be returned.

Thanks

2012/2/1, 21:33:41:

> Hello Sergey,
>
> if your docs look like:
>
> PATH:'directory','tree','segments','test1'
> FILES:'filename1','ext1','filename2','ext2','filename3','ext3','filename4','ext4'
>
> you can search them with:
>
> +PATH:test1 +FILES:jpg
>
> --
> Sincerely yours
> Mikhail Khludnev
> Lucid Certified Apache Lucene/Solr Developer
> Grid Dynamics

--
Best regards,
asv
mailto:asvs...@gmail.com
Re: Re[2]: Is complex query like this possible?
Sergey,

Try using http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory

Regards

On Wed, Feb 1, 2012 at 11:59 PM, asv - gmail asvs...@gmail.com wrote:

> Hello, Mikhail.
>
> Each index record looks like:
>
> DIR:true
> PATH:/root/folder1/folder2/
> NAME:folder3
> SIZE:0
> ...
>
> This record represents the folder /root/folder1/folder2/folder3
>
> DIR:false
> PATH:/root/folder1/folder2/folder3/
> NAME:image.jpg
> SIZE:1234567
> ...
>
> This is the file /root/folder1/folder2/folder3/image.jpg
>
> I.e. PATH is the path to the parent directory, and NAME is the actual name of the file or folder. We do not store a list of children in the folder record (as in your solution). Also, in my previous example a file of the specified type may be deeper than one level: if there are /root/folder1, /root/folder2 and the file /root/folder1/aaa/bbb/ccc/image.jpg, and I query for "folder", only folder1 must be returned.
>
> Thanks
>
> --
> Best regards,
> asv
> mailto:asvs...@gmail.com

--
Sincerely yours
Mikhail Khludnev
Lucid Certified Apache Lucene/Solr Developer
Grid Dynamics
http://www.griddynamics.com
mkhlud...@griddynamics.com
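The message does not say how the tokenizer should be configured. One plausible reading is that PATH would be split on "/" so that each directory name becomes a searchable token. The snippet below only emulates that effect in plain Python to show the tokens one would expect; it is not Solr code, the function name is hypothetical, and whether Solr's tokenizer drops empty tokens in exactly the same way is an assumption.

```python
import re


def tokenize_path(path, pattern="/"):
    """Split a path on a delimiter regex and drop empty tokens."""
    return [token for token in re.split(pattern, path) if token]


print(tokenize_path("/root/folder1/folder2/"))  # ['root', 'folder1', 'folder2']
```

With PATH tokenized like this, a file document at /root/folder1/aaa/bbb/ccc/ would match a term query for folder1, but the hit would still be the file document, not the folder1 document.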
Re[2]: Is complex query like this possible?
: DIR:true
: PATH:/root/folder1/folder2/
: NAME:folder3
: SIZE:0 ...
: DIR:false
: PATH:/root/folder1/folder2/folder3/
: NAME:image.jpg
: SIZE:1234567 ...
: your solution). Also, in my previous example a file of specified type
: may be deeper than one level: if there are /root/folder1, /root/folder2
: and file /root/folder1/aaa/bbb/ccc/image.jpg, and I query for folder,
: only folder1 must be returned.

I don't think you're going to find an *easy* way to do what you want -- solr is designed to return *documents* that match queries, and you've modeled documents to match individual files -- so it's not easy to get solr to return the ancestor directories of those files as results.

grouping could be used for something like "find the parent directories of files that match this query" if you grouped on the PATH, but that won't help you with your expectation that an example like /root/folder1/aaa/bbb/ccc/image.jpg should return /root/folder1

-Hoss
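The limitation described above can be seen in a small in-memory imitation of grouping on PATH. This is a plain Python sketch over records shaped like the ones in the thread (DIR, PATH, NAME); it does not call Solr, and the function name is hypothetical.

```python
from collections import defaultdict


def group_files_by_path(docs, name_suffix):
    """Group matching file records by their PATH (the direct parent directory)."""
    groups = defaultdict(list)
    for doc in docs:
        if not doc["DIR"] and doc["NAME"].lower().endswith(name_suffix):
            groups[doc["PATH"]].append(doc["NAME"])
    return dict(groups)


records = [
    {"DIR": True, "PATH": "/root/", "NAME": "folder1"},
    {"DIR": True, "PATH": "/root/", "NAME": "folder2"},
    {"DIR": False, "PATH": "/root/folder1/aaa/bbb/ccc/", "NAME": "image.jpg"},
]
print(group_files_by_path(records, ".jpg"))
# {'/root/folder1/aaa/bbb/ccc/': ['image.jpg']}
```

The only group key is the direct parent /root/folder1/aaa/bbb/ccc/, so /root/folder1 never appears in the result, which is why grouping on PATH alone does not cover the nested case.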