Bug#784540: recollindex always indexes /tmp despite being given a different path

2015-05-07 Thread Jean-Francois Dockes
The attached patch checks that all topdirs elements are absolute paths
before starting the indexing. The change will be in all future releases
until relative paths can be properly supported.
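
For readers without the attachment, the check described above might look roughly like the following Python sketch. It is not the actual patch (Recoll itself is C++). The function names are hypothetical, and it assumes topdirs has already been split into a list of strings and that a leading ~ should be expanded before testing.

```python
import os.path

def relative_topdirs(topdirs):
    """Return the topdirs entries that are not absolute paths.

    Assumption: a leading "~" is expanded before the test, so "~/docs"
    is treated as absolute.
    """
    return [d for d in topdirs if not os.path.isabs(os.path.expanduser(d))]

def check_topdirs(topdirs):
    """Refuse to start indexing if any topdirs entry is relative."""
    bad = relative_topdirs(topdirs)
    if bad:
        raise ValueError(
            "topdirs entries must be absolute paths: " + ", ".join(bad))

if __name__ == "__main__":
    # "./" is the value that triggered this bug report.
    print(relative_topdirs(["/home/me/docs", "./"]))
```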

Cheers,

jf


recoll-check-topdirs-abs.diff
Description: Binary data


Bug#784540: recollindex always indexes /tmp despite being given a different path

2015-05-06 Thread Helmut Grohne
Package: recoll
Version: 1.20.3-2
Severity: important
File: /usr/bin/recollindex

recollindex -c someconfdir some/path used to index some/path.
Since upgrading from 1.17.3-2 to 1.20.3-2 it indexes /tmp instead.

It seems that the new version does not provide any means to index files
outside /tmp. If this observation is wrong, please downgrade the
severity of this bug.

Helmut


-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org



Bug#784540: recollindex always indexes /tmp despite being given a different path

2015-05-06 Thread Helmut Grohne
On Wed, May 06, 2015 at 06:15:46PM +0200, Jean-Francois Dockes wrote:
 Relative paths in topdirs should either be banned or be made absolute, not
 against the current directory, but against the configuration directory
 itself.

That makes sense to me. My use case actually is having topdirs relative
to the configuration directory. In the absence of that being implemented,
I resorted to paths relative to the working directory and wrapped
recollindex in a script doing the chdir.

  - On your side, please use absolute paths inside recoll.conf. This is the
workaround for now.

I confirm that this works for me.

  - The next minor Recoll release will generate an error if relative paths
are found in topdirs.

I think that a proper error message closes this bug.

  - I am going to think a bit more about this. If confdir-relative paths can
make sense in some circumstances (maybe they could have a use for
removable media if I can make it work), I'll change the code to make it
useful, but this is a big change, maybe in 1.21

Looking forward to this new feature. Removable (or relocatable) media is
exactly the use case.

 In any case, thank you very much for tracking this down and finding the root
 cause.

Thanks for maintaining and developing this useful piece of software.

Helmut





Bug#784540: recollindex always indexes /tmp despite being given a different path

2015-05-06 Thread Jean-Francois Dockes
Helmut Grohne writes:
  Control: severity -1 normal
  Control: retitle -1 topdirs no longer accepts relative paths
  
  Thanks for your quick reply!
  
  On Wed, May 06, 2015 at 04:42:02PM +0200, Jean-Francois Dockes wrote:
   The way to specify things to be indexed is to set the topdirs variable
   inside someconfdir/recoll.conf. This can be set by editing the file or from
   the indexing configuration section of the GUI.
  
   What is the value of this variable inside 'someconfdir' in the above test ?
  
  My topdirs variable is set to ./. It looks like the issue relates to
  relative paths only.
  
   Furthermore, additional args have always been ignored by recollindex if
   neither -i nor -e was set. I just checked that this was true of 1.17
   
   Did you actually mean to type -i some/path in the above report ?
  
  Probably. I tried both with -i and without -i. The behaviour for the new
  version would always be to only consider files below /tmp.
  
  Given the above I guess that my issue relates to recollindex having
  added a call to chdir that was not present in the old version:
  
  http://sources.debian.net/src/recoll/1.20.3-2/index/recollindex.cpp/#L403
  
  I am not sure whether the issue at hand is worth fixing at all or
  whether I just need to update my configuration in a suitable way.
  
  Possibly the value of topdirs could be canonicalized using realpath
  prior to invoking chdir? That's probably just the tip of the iceberg
  though.

I think that you have explained everything, and found an error-checking +
doc bug.

Relative paths in topdirs should either be banned or be made absolute, not
against the current directory, but against the configuration directory
itself.
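
To make the second option concrete, here is a minimal Python sketch of resolving a topdirs entry against the configuration directory instead of the current directory. This is only an illustration of the idea, not Recoll code. The function name and the example paths are hypothetical.

```python
import os.path

def resolve_against_confdir(confdir, topdir):
    """Make a topdirs entry absolute, using confdir as the base.

    Absolute entries are only normalized. Relative entries are joined
    to the configuration directory, so the result no longer depends on
    the directory the indexer was started from.
    """
    if os.path.isabs(topdir):
        return os.path.normpath(topdir)
    return os.path.normpath(os.path.join(os.path.abspath(confdir), topdir))

if __name__ == "__main__":
    # Hypothetical removable-media layout: the configuration travels
    # with the data it describes.
    print(resolve_against_confdir("/media/usb/conf", "../docs"))
```

With a rule like this, a configuration directory stored on removable media could keep pointing at sibling directories wherever the media is mounted.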

You are right that the chdir() to /tmp is what exposed the problem. The
reason for doing this is that some of the helper programs used by recoll
have a tendency to leave temporary files around, and it is best that they
do so in /tmp.

It's very weird that nobody ever hit this to this day... Even without the
chdir(), the fact that the data being indexed depended on the cd was quite
unsound (starting indexing from the wrong place would empty the index as I
just saw...). Otoh, as paths are made absolute during indexing (against the
current directory, both 1.17 and 1.20), search result preview/open would
work from any location.

All topdirs examples in the doc are with absolute paths, but there is
nothing that explicitly bans relative paths.

And the chdir has been in there for around 2 years, many people have run
with it.

10 years of recoll and still surprises...

So for the followup:

 - On your side, please use absolute paths inside recoll.conf. This is the
   workaround for now.

 - The next minor Recoll release will generate an error if relative paths
   are found in topdirs.

 - I am going to think a bit more about this. If confdir-relative paths can
   make sense in some circumstances (maybe they could have a use for
   removable media if I can make it work), I'll change the code to make it
   useful, but this is a big change, maybe in 1.21

If you can think of a good reason to keep relative paths against the
current directory working, I would be quite interested in your observations.

In any case, thank you very much for tracking this down and finding the root
cause.

Cheers,

jf





Bug#784540: recollindex always indexes /tmp despite being given a different path

2015-05-06 Thread Helmut Grohne
Control: severity -1 normal
Control: retitle -1 topdirs no longer accepts relative paths

Thanks for your quick reply!

On Wed, May 06, 2015 at 04:42:02PM +0200, Jean-Francois Dockes wrote:
 The way to specify things to be indexed is to set the topdirs variable
 inside someconfdir/recoll.conf. This can be set by editing the file or from
 the indexing configuration section of the GUI.

 What is the value of this variable inside 'someconfdir' in the above test ?

My topdirs variable is set to ./. It looks like the issue relates to
relative paths only.

 Furthermore, additional args have always been ignored by recollindex if
 neither -i nor -e was set. I just checked that this was true of 1.17
 
 Did you actually mean to type -i some/path in the above report ?

Probably. I tried both with -i and without -i. The behaviour for the new
version would always be to only consider files below /tmp.

Given the above I guess that my issue relates to recollindex having
added a call to chdir that was not present in the old version:

http://sources.debian.net/src/recoll/1.20.3-2/index/recollindex.cpp/#L403

I am not sure whether the issue at hand is worth fixing at all or
whether I just need to update my configuration in a suitable way.

Possibly the value of topdirs could be canonicalized using realpath
prior to invoking chdir? That's probably just the tip of the iceberg
though.
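
A minimal Python sketch of that ordering, for illustration only: relative entries are resolved while the process is still in its starting directory, and only then does it change directory. The function name is hypothetical and this is not how recollindex.cpp is written.

```python
import os

def canonicalize_then_chdir(topdirs, workdir):
    """Resolve topdirs against the current directory, then chdir.

    os.path.realpath makes each entry absolute and resolves symlinks,
    so the later chdir can no longer change what the entries refer to.
    """
    resolved = [os.path.realpath(d) for d in topdirs]
    os.chdir(workdir)
    return resolved

if __name__ == "__main__":
    # Resolved in the starting directory; "./" evaluated after the
    # chdir would refer to the new working directory instead.
    print(canonicalize_then_chdir(["./"], "/tmp"))
```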

Helmut





Bug#784540: recollindex always indexes /tmp despite being given a different path

2015-05-06 Thread Jean-Francois Dockes
Helmut Grohne writes:
  Package: recoll
  Version: 1.20.3-2
  Severity: important
  File: /usr/bin/recollindex
  
  recollindex -c someconfdir some/path used to index some/path.
  Since upgrading from 1.17.3-2 to 1.20.3-2 it indexes /tmp instead.
  
  It seems that the new version does not provide any means to index files
  outside /tmp. If this observation is wrong, please downgrade the
  severity of this bug.
  
  Helmut

The way to specify things to be indexed is to set the topdirs variable
inside someconfdir/recoll.conf. This can be set by editing the file or from
the indexing configuration section of the GUI.

What is the value of this variable inside 'someconfdir' in the above test ?

Furthermore, additional args have always been ignored by recollindex if
neither -i nor -e was set. I just checked that this was true of 1.17

Did you actually mean to type -i some/path in the above report ?

In this case, some/path still needs to be inside the indexed area.

I just also checked that this was already the case with 1.17

Test: recollindex 1.17.4, ~/projets is not in topdirs:

recollindex -c ~/.recoll -i ~/projets

Diagnostic in the log file:

:4:../index/fsindexer.cpp:168:FsIndexer::indexFiles: skipping [/home/dockes/projets] (ntd)

(ntd means not in top dirs).
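
As a rough illustration of what such a "not in top dirs" test amounts to, here is a small Python sketch. It is not the FsIndexer code. The function name is hypothetical, the topdirs value in the example is invented, and the sketch assumes all paths are absolute.

```python
import os.path

def in_topdirs(path, topdirs):
    """Return True if path is one of topdirs or lies below one of them."""
    path = os.path.abspath(path)
    for top in topdirs:
        top = os.path.abspath(top)
        # commonpath compares whole path components, so /a/docs2 is
        # not mistaken for something below /a/docs.
        if os.path.commonpath([path, top]) == top:
            return True
    return False

if __name__ == "__main__":
    # Mirrors the test above: ~/projets is outside the configured topdirs.
    print(in_topdirs("/home/dockes/projets", ["/home/dockes/docs"]))
```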

So, at this point, I am at a loss to explain what you could do with 1.17
that you can not do with 1.20...


Regards,

Jean-Francois Dockes

