On Sat, Aug 18, 2012 at 10:27:28PM +0200, Vladimir Nadvornik wrote:

> http://www.mail-archive.com/geeqie-devel@lists.sourceforge.net/msg00356.html

1. I'd rather leave out the timestamp in the directory table.
   One still has to search the subdirectories anyway, so this information
   does not help at all.

2. The metadata table: Is it for METADATA_PLAIN or METADATA_FORMATTED data?
   If the former: may I suggest a second table for formatted data? I'd rather
   not search for all the possible different ways to express F/5.6...

3. I have a few thousand RAW files where exif_get_metadata (and likely the
   lower-level functions that do the same thing, too) returns an array for
   things which should be just a single value (Exif.Image.BitsPerSample).
   To have some kind of meaningful primary key I included an index field
   in the table (I can't use "file_id,key,value" as the value was 8 for all
   three channels).
   This also handles duplicate keywords (Xmp.dc.subject) quite well.
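
A minimal sketch of how such an index field could be filled in, so that (file_id, key, idx) is unique even when a key repeats with the same value. The struct layout and the names meta_row, idx and number_rows are assumptions made for illustration, not the actual table definition:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One metadata value as it would be stored in one table row.
 * Hypothetical layout; (file_id, key, idx) is meant to be the primary key. */
struct meta_row {
    long file_id;
    const char *key;
    const char *value;
    int idx;            /* filled in by number_rows() */
};

/* Give every row the count of earlier rows with the same file_id and key,
 * so repeated keys get idx 0, 1, 2, ... in the order they were read. */
void number_rows(struct meta_row *rows, size_t n)
{
    assert(rows != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (rows[j].file_id == rows[i].file_id &&
                strcmp(rows[j].key, rows[i].key) == 0)
                seen++;
        }
        rows[i].idx = seen;
    }
}
```

With three Exif.Image.BitsPerSample rows that all carry the value 8, the rows end up with idx 0, 1 and 2, so the value no longer has to be part of the key.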

4. Right now the thing works as follows:
    - read some directory
    - recursively process subdirectories, come back.

    - start transaction
    - process the files in the directory
    - take care of deleted files and subdirectories.
    - commit transaction.

    I use transactions as sqlite is dead slow if you don't (limited to ~60
    single database changes per second), and the asynchronous mode, which
    does not use fsync() and friends, just leads to corrupt databases if
    someone hits ^C.

    The downside of this approach is that the database is locked during the
    transaction, which in a directory with 1000 images might take 2 minutes
    or more.

    I could change to use one transaction per file. This still has reasonable
    performance compared to the approach without transactions (but see below
    for "reasonable") and wouldn't block the database for so long, but it
    would still block for seconds.
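
The per-directory transaction can be sketched as follows. This is an illustration under assumptions, not the actual code: struct db_ops, import_directory and the two log_* stand-ins are hypothetical names, and the database is hidden behind two callbacks so the sketch runs without sqlite. With sqlite, exec() could wrap sqlite3_exec(db, sql, NULL, NULL, NULL).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical backend hooks; both return 0 on success. */
struct db_ops {
    int (*exec)(void *db, const char *sql);
    int (*process_file)(void *db, const char *path);
};

/* Import all files of one directory inside a single transaction.
 * Returns 0 if the transaction was committed, -1 if it could not be
 * started or had to be rolled back. */
int import_directory(void *db, const struct db_ops *ops,
                     const char *const *files, size_t n_files)
{
    assert(ops != NULL && ops->exec != NULL && ops->process_file != NULL);
    if (ops->exec(db, "BEGIN") != 0)
        return -1;
    for (size_t i = 0; i < n_files; i++) {
        if (ops->process_file(db, files[i]) != 0) {
            ops->exec(db, "ROLLBACK");   /* undo the whole directory */
            return -1;
        }
    }
    if (ops->exec(db, "COMMIT") != 0) {
        ops->exec(db, "ROLLBACK");
        return -1;
    }
    return 0;
}

/* Stand-in backend for trying the sketch without a database: "db" is a
 * char buffer (large enough, provided by the caller) that collects a log
 * of the calls. */
int log_exec(void *db, const char *sql)
{
    strcat((char *)db, sql);
    strcat((char *)db, ";");
    return 0;
}

int log_process_file(void *db, const char *path)
{
    /* A file named "bad" simulates a failing database change. */
    if (strcmp(path, "bad") == 0)
        return -1;
    strcat((char *)db, "file ");
    strcat((char *)db, path);
    strcat((char *)db, ";");
    return 0;
}
```

Switching to one transaction per file would move the BEGIN/COMMIT pair inside the loop, which shortens each lock but pays the commit cost once per file.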

5. If absolutely nothing has changed, my functions need 5.x seconds to read
   all the directories in ~/Bilder with its 63000 images in 1900 directories.
   (4 core Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz, 8 GB, slow drive.)

6. Importing 15000 images takes about 40 minutes (with two metadata tables at
   the moment).
   As most of the time is spent inside sqlite and exif_get_metadata I don't
   see any way to speed this up by much.
   This is a bit long, as right now it is done at startup, before any user
   input is possible.


7. Performance: 50 .JPG, 50 .ORF, into an already existing large database:

   - per directory transaction
   -- first import: 0:16 minutes:seconds
   -- re-import:    0:19 minutes:seconds

   - per file transaction
   -- first import: 2:24 minutes:seconds
   -- re-import:    2:40 minutes:seconds

   To me that means that the per directory transaction is the right thing
   to do, unless one needs to be able to react to user input in that time.
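
Put into seconds, the figures above are 16 s versus 144 s for the first import and 19 s versus 160 s for the re-import, so the per-file variant is about 9 and about 8.4 times slower. A small sketch of that arithmetic (the helper names mmss_to_seconds and slowdown are made up for this example):

```c
#include <assert.h>
#include <stdio.h>

/* Convert a duration written as "M:SS" to seconds; -1 if malformed. */
int mmss_to_seconds(const char *text)
{
    int minutes, seconds;

    assert(text != NULL);
    if (sscanf(text, "%d:%d", &minutes, &seconds) != 2)
        return -1;
    if (minutes < 0 || seconds < 0 || seconds > 59)
        return -1;
    return minutes * 60 + seconds;
}

/* How many times slower the per-file run was than the per-directory run.
 * Both arguments must be valid, non-zero "M:SS" durations. */
double slowdown(const char *per_dir, const char *per_file)
{
    return (double)mmss_to_seconds(per_file) / mmss_to_seconds(per_dir);
}
```

The per-directory first import also works out to 16 s / 100 files = 0.16 s per file, the same rate as the 15000 images in 40 minutes mentioned under point 6.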

8. Performance Part 2:
    - deleting the metadata of 1000 files takes a *long* time (about 40
      seconds). Don't know why.

9. I tried the following to update the status display in the main window (which
   is visible at that time), but it didn't work.

        LayoutWindow *lw=NULL;
        if (!layout_valid(&lw))
            return;
        if (lw && !lw->info_status)
            return;
        /* we reach this point! */
        layout_status_update_info(lw,path);
        layout_status_update_progress(lw,val,"dir");
        layout_util_status_update_write(lw);

    Note that I do not make any other GTK calls anywhere.
    What else do i need to do?

10. This might be a good time to define an interface for the rest of
    geeqie. Any ideas?

    /* to be used for files and directories */
    metadatadb_remove(const char *);
    metadatadb_add(const char *); /* update, too */
    metadatadb_rename(const char *from, const char *to);

    GList *metadatadb_get(const char *fname, const char *key);
    GList *metadatadb_get_filedata(FileData *fd, const char *key);
        /* = { return metadatadb_get(fd->original_path, key); } ? */

11. Must I assume that the legacy metadata stuff is still in use somewhere?

12. Right now I throw about 150 different exif / xmp tags into the database.
    This is fine for testing... but possibly a bit much otherwise.
    Is there a list of tags needed by or useful for geeqie?
    Shall this list be made configurable? If so: how?
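
One possible shape for a configurable list, purely as an assumption for discussion: a single option holding a comma-separated whitelist of tag names, checked before a tag is written to the database. The function name tag_is_wanted and the list format are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if key appears in the comma-separated list, 0 otherwise.
 * Entries are compared exactly, without trimming whitespace. */
int tag_is_wanted(const char *list, const char *key)
{
    assert(list != NULL && key != NULL);

    size_t key_len = strlen(key);
    if (key_len == 0)
        return 0;

    const char *p = list;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current entry */
        if (len == key_len && strncmp(p, key, len) == 0)
            return 1;
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return 0;
}
```

Comparing whole entries means that a prefix such as "Xmp.dc" does not match "Xmp.dc.subject".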

Regards, Uwe

_______________________________________________
Geeqie-devel mailing list
Geeqie-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/geeqie-devel
