Re: Is complex query like this possible?

2012-02-02 Thread Sergei Ananko
Hello, Chris.

Thank you and Mikhail for the explanation. I'll think about changing the 
indexing model so that it can handle this case.


-- 
Best regards,
Asv mailto:asvs...@gmail.com 

Is complex query like this possible?

2012-02-01 Thread Sergei Ananko
Hello, 

We use Solr to search over a filesystem, so a lot of files and folders are 
indexed; the name and path of each file are stored in separate fields. The 
task is to find folders by name that also contain at least one file of a 
specific type somewhere inside. For example, we search by the phrase "test" 
and for JPG files and have two folders:

1) test1 - empty folder
2) test2 - contains 1 file abcd.jpg inside.

The search result must contain only folder test2, because test1 does not 
meet the second criterion.

The SQL equivalent of such a search query looks like:

SELECT * FROM indexed_files t1 WHERE t1.name LIKE '%test%' AND (SELECT COUNT(*) 
FROM indexed_files t2 WHERE t2.path LIKE CONCAT(t1.path, '%') AND t2.name LIKE 
'%jpg') > 0; 

The question is: is it possible to do such a search in Solr with a single query? 
A single query is important because we need to use Solr's paging (the start and 
rows parameters), so we should avoid filtering out wrong results in our code. 
I've read the Solr wiki about nested queries but haven't found a way to do it. 
BTW, does Solr provide an equivalent of the SELECT COUNT(*) statement to access 
the count of found records directly in a Solr query? Or is such a complex query 
completely impossible?

-- 
Best regards,
 Asv  mailto:asvs...@gmail.com 

Re: Is complex query like this possible?

2012-02-01 Thread Mikhail Khludnev
Hello Sergey,

If your docs look like:

PATH:'directory','tree','segments','test1'
FILES:'filename1','ext1','filename2','ext2','filename3','ext3','filename4','ext4'
you can search them with:
+PATH:test1 +FILES:jpg


-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
mkhlud...@griddynamics.com


Re[2]: Is complex query like this possible?

2012-02-01 Thread asv - gmail
Hello, Mikhail.

Each index record looks like:

DIR:true
PATH:/root/folder1/folder2/
NAME:folder3
SIZE:0
...

This record represents folder /root/folder1/folder2/folder3

DIR:false
PATH:/root/folder1/folder2/folder3/
NAME:image.jpg
SIZE:1234567
...

This is a file /root/folder1/folder2/folder3/image.jpg

I.e., PATH is the path to the parent directory and NAME is the actual name of 
the file/folder. We do not store a list of children in the folder record (as in 
your solution). Also, in my previous example a file of the specified type may 
be deeper than one level: if there are /root/folder1, /root/folder2 and the file 
/root/folder1/aaa/bbb/ccc/image.jpg, and I query for "folder", only folder1 
must be returned.   
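
To make the expected behaviour concrete, here is a small Python sketch of the matching rule described above, run over in-memory records in the same DIR/PATH/NAME shape. It is only an illustration of the semantics, not a Solr query; the function name `folders_containing` and the sample records are made up for the example.

```python
# Toy records: PATH is the parent directory, NAME is the entry's own name.
records = [
    {"DIR": True,  "PATH": "/root/", "NAME": "folder1"},
    {"DIR": True,  "PATH": "/root/", "NAME": "folder2"},
    {"DIR": False, "PATH": "/root/folder1/aaa/bbb/ccc/", "NAME": "image.jpg"},
]

def folders_containing(records, name_part, extension):
    """Return folders whose NAME contains name_part and that have at least
    one file with the given extension anywhere beneath them."""
    suffix = "." + extension.lower()
    file_parents = [
        r["PATH"] for r in records
        if not r["DIR"] and r["NAME"].lower().endswith(suffix)
    ]
    hits = []
    for r in records:
        if not r["DIR"] or name_part not in r["NAME"]:
            continue
        # The trailing slash keeps /root/folder1 from matching /root/folder10.
        prefix = r["PATH"] + r["NAME"] + "/"
        if any(p.startswith(prefix) for p in file_parents):
            hits.append(prefix.rstrip("/"))
    return hits

result = folders_containing(records, "folder", "jpg")
print(result)  # ['/root/folder1']
```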

Thanks



-- 
Best regards,
 asv  mailto:asvs...@gmail.com

Re: Re[2]: Is complex query like this possible?

2012-02-01 Thread Mikhail Khludnev
Sergey,

Try to employ
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternTokenizerFactory
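
As a rough illustration (not from the original message), the Python sketch below emulates what splitting a PATH value on "/" would produce, which is roughly what one would expect from a pattern tokenizer configured to split on that character. The function `split_path` is hypothetical, and dropping empty tokens is an assumption of this sketch; check the linked wiki page for the actual tokenizer behaviour and attributes.

```python
import re

def split_path(path, pattern="/"):
    """Split a path on the given regex and drop empty pieces."""
    return [tok for tok in re.split(pattern, path) if tok]

tokens = split_path("/root/folder1/folder2/folder3/")
print(tokens)  # ['root', 'folder1', 'folder2', 'folder3']
```

With PATH tokenized this way, a term query on a single folder name can match a file stored at any depth below that folder.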

Regards



-- 
Sincerely yours
Mikhail Khludnev
Lucid Certified
Apache Lucene/Solr Developer
Grid Dynamics

http://www.griddynamics.com
 mkhlud...@griddynamics.com


Re[2]: Is complex query like this possible?

2012-02-01 Thread Chris Hostetter

: DIR:true
: PATH:/root/folder1/folder2/
: NAME:folder3
: SIZE:0
...
: DIR:false
: PATH:/root/folder1/folder2/folder3/
: NAME:image.jpg
: SIZE:1234567
...
: your solution). Also, in my previous example a file of specified type 
: may be deeper than one level: if there are /root/folder1, /root/folder2 
: and file /root/folder1/aaa/bbb/ccc/image.jpg, and I query for folder, 
: only folder1 must be returned.

I don't think you're going to find an *easy* way to do what you want -- 
solr is designed to return *documents* that match queries, and you've 
modeled documents to match individual files -- so it's not easy to get 
solr to return the ancestor directories of those files as results.

grouping could be used for something like "find the parent directories of 
files that match this query" if you grouped on the PATH, but that won't 
help you with your expectation that an example like 
/root/folder1/aaa/bbb/ccc/image.jpg should return 
/root/folder1

-Hoss
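
A small Python sketch of the point above, for illustration only: grouping matching file documents by PATH yields the immediate parent directory, not the higher ancestor the original poster wants back. The `ancestors` helper shows the list of directories a file would have to be associated with for /root/folder1 to be reachable; storing such a list per file is an assumption of this sketch, not something proposed in the thread.

```python
from collections import defaultdict

files = [
    {"PATH": "/root/folder1/aaa/bbb/ccc/", "NAME": "image.jpg"},
    {"PATH": "/root/folder1/aaa/bbb/ccc/", "NAME": "photo.jpg"},
    {"PATH": "/root/folder2/", "NAME": "notes.txt"},
]

def group_by_parent(docs, extension):
    """Group matching file names by their PATH (immediate parent directory)."""
    groups = defaultdict(list)
    for d in docs:
        if d["NAME"].lower().endswith("." + extension):
            groups[d["PATH"]].append(d["NAME"])
    return dict(groups)

def ancestors(path):
    """List every ancestor directory of a PATH value, shallowest first."""
    parts = [p for p in path.split("/") if p]
    return ["/" + "/".join(parts[:i]) for i in range(1, len(parts) + 1)]

groups = group_by_parent(files, "jpg")
print(groups)     # only the deepest directory appears as a group key
print(ancestors("/root/folder1/aaa/bbb/ccc/"))
```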