I had a quick look. It seems cl_demo is indeed broken (following some
work Ben did on it a year ago), and we are working on it. As I said before,
try using CLucene for your needs as-is, and let us know if you hit any
walls.

The UTF8 test fails because of src\test\data\utf8text\french_utf8.txt. I
can't seem to commit it with -crlf -diff, so an extra LF is added, and that
breaks the UTF8 code. It could also be a code issue, but that is less likely.
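
One way to check whether a checkout altered the line endings of the test
file is to count the line-ending bytes in the raw data and compare them
with a known-good copy. The following is only a minimal sketch: the names
LineEndingCounts and countLineEndings are hypothetical and not part of
CLucene. It relies on the fact that the bytes 0x0D and 0x0A never occur
inside a multi-byte UTF-8 sequence, so a plain byte scan is safe.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type (not part of CLucene): counts of the
// line-ending bytes found in a raw buffer.
struct LineEndingCounts {
    std::size_t crlf = 0;    // "\r\n" pairs
    std::size_t bareLf = 0;  // '\n' not preceded by '\r'
    std::size_t bareCr = 0;  // '\r' not followed by '\n'
};

// Scans the buffer byte by byte. CR (0x0D) and LF (0x0A) never appear
// inside a multi-byte UTF-8 sequence, so no decoding is required.
LineEndingCounts countLineEndings(const std::string& bytes) {
    LineEndingCounts counts;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (bytes[i] == '\r') {
            if (i + 1 < bytes.size() && bytes[i + 1] == '\n') {
                ++counts.crlf;
                ++i;  // skip the LF that belongs to this CRLF pair
            } else {
                ++counts.bareCr;
            }
        } else if (bytes[i] == '\n') {
            ++counts.bareLf;
        }
    }
    return counts;
}
```

To use it, read the file in binary mode (for example with std::ifstream
opened with std::ios::binary) so the runtime does not translate line
endings itself, then compare the counts for the checked-out file against
the original.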

Itamar. 

> -----Original Message-----
> From: Klemens Friedl [mailto:fri...@gmail.com] 
> Sent: Sunday, June 13, 2010 6:21 PM
> To: clucene-developers@lists.sourceforge.net
> Subject: Re: [CLucene-dev] clucene - cl_demo stops with error
> 
> I tried running cl_demo in both versions with a reduced
> Reuters corpus; see the difference in the files:
> files-reuters_*.txt
> 
> The branch cl_demo version worked fine with fewer files, weird:
> branch_cl_demo_short.txt
> 
> The master version works only with very few files; with a
> few more it shows another error:
> cl_demo_short_*.txt
> 
> 
> Klemens
> 
> 
> 2010/6/13 Klemens Friedl <fri...@gmail.com>:
> > The cl_test apps run, but one test fails in both, and the two versions
> > run a different number of tests.
> >
> > cl_test (master) runs 97 tests; one of the two UTF8 tests failed, see the
> > "cl_test.txt" file (attached to the last email).
> > cl_test (atomicthreads) runs 102 tests; again one of the two UTF8 tests
> > failed, see the "branch_cl_test.txt" file.
> >
> >
> > cl_demo crashes in both. Yesterday I tested cl_demo with only
> > about half of the documents in the reuters test directory, and it ran
> > through fine. I played around a bit, and it seems that cl_demo crashes
> > while indexing text files of a few kilobytes (files that are a bit
> > larger than the smallest text files in the directory). The index
> > merging and optimizing process takes an unusually long time (in my
> > opinion), as the index files combined take up maybe a megabyte of disk
> > space. Weird.
> >
> >
> > 2010/6/13 Itamar Syn-Hershko <ita...@divrei-tora.com>:
> >> Just to confirm: for both branches, cl_test works fine but cl_demo crashes?
> >>
> >>> -----Original Message-----
> >>> From: Klemens Friedl [mailto:fri...@gmail.com]
> >>> Sent: Sunday, June 13, 2010 5:37 PM
> >>> To: clucene-developers@lists.sourceforge.net
> >>> Subject: Re: [CLucene-dev] clucene - cl_demo stops with error
> >>>
> >>> I built and ran cl_test and cl_demo again on master and the
> >>> atomicthreads branch with default cmake settings, see attached
> >>> files.
> >>> I included a stack trace for the cl_demo app in both cases.
> >>>
> >>> Klemens
> >>>
> >>>
> >>> 2010/6/13 Itamar Syn-Hershko <ita...@divrei-tora.com>:
> >>> > Can you please test the master branch (cl_test and cl_demo) with
> >>> > default cmake settings as well?
> >>> >
> >>> > Also, can you send the stack trace for this deadlock? If you get this
> >>> > on master, then for master, otherwise for atomicthreads.
> >>> >
> >>> > Itamar.
> >>> >
> >>> >> -----Original Message-----
> >>> >> From: Klemens Friedl [mailto:fri...@gmail.com]
> >>> >> Sent: Sunday, June 13, 2010 11:16 AM
> >>> >> To: clucene-developers@lists.sourceforge.net
> >>> >> Subject: Re: [CLucene-dev] clucene - cl_demo stops with error
> >>> >>
> >>> >> I tried cl_test and cl_demo with the atomicthreads branch and
> >>> >> default cmake settings (except for added zlib path vars).
> >>> >> (see attached log files)
> >>> >>
> >>> >> cl_test runs through 102 tests, but fails on the first of the two
> >>> >> UTF8 tests.
> >>> >> cl_demo indexes all files of the reuters corpus, though it deadlocks
> >>> >> right after that :/
> >>> >>
> >>> >>
> >>> >> Kind regards,
> >>> >> Klemens Friedl
> >>> >>
> >>> >>
> >>> >>
> >>> >> F:\Home\Search\clucene\atomicthreads\build\bin\Debug>cl_test.exe
> >>> >> Key: .= pass N=not implemented F=fail
> >>> >> All CLucene Tests:
> >>> >>     CLucene Atomic Updates Test:     ..              - 6203ms
> >>> >>     CLucene IndexReader Test:        ..              - 766ms
> >>> >>     CLucene Reuters Test:            ...             - 8547ms
> >>> >>     CLucene Analysis Test:           .               - 0ms
> >>> >>     CLucene Analyzers Test:          .........       - 234ms
> >>> >>     CLucene Document Test:           ......          - 4563ms
> >>> >>     CLucene Number Tools Test:       ...             - 422ms
> >>> >>     CLucene Debug Test:              .               - 0ms
> >>> >>     CLucene IndexWriter Test:        ......          - 4281ms
> >>> >>     CLucene IndexModifier Test:      .               - 56047ms
> >>> >>     CLucene High Frequencies Test:   .               - 16ms
> >>> >>     CLucene Priority Queue Test:     .               - 62ms
> >>> >>     CLucene DateTools Test:          ..              - 0ms
> >>> >>     CLucene Query Parser Test:       ............... - 63ms
> >>> >>     CLucene Multi-Field QP Test:     ..              - 0ms
> >>> >>     CLucene Boolean Tests:           ....            - 15ms
> >>> >>     CLucene Search Test:             ..............  - 609ms
> >>> >>     CLucene Queries Test:            ..              - 16ms
> >>> >>     CLucene Term Vector Test:        .....           - 78ms
> >>> >>     CLucene Sort Test:               ...........     - 79ms
> >>> >>     CLucene Duplicates Test:         ..              - 125ms
> >>> >>     CLucene DateFilter Test:         ...             - 78ms
> >>> >>     CLucene Wildcard Test:           ..              - 0ms
> >>> >>     CLucene Store Test:              ..              - 297ms
> >>> >>     CLucene UTF8 Test:               F.              - 187ms
> >>> >>
> >>> >> 102 tests run:  101 passed, 1 failed, 0 not implemented.
> >>> >>
> >>> >> Tests run in 82843ms
> >>> >>
> >>> >> WARNING: stringPool still contains intern'd strings (refcounts):
> >>> >>  contents (10)
> >>> >>  field1 (5)
> >>> >>  field2 (5)
> >>> >>  field3 (5)
> >>> >>  field4 (5)
> >>> >>  id (4)
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >>
> >>> >> F:\Home\Search\clucene\atomicthreads\build\bin\Debug>cl_demo.exe
> >>> >> adding file 1: ..\src\test\data\reuters-21578/all-exchanges-strings.lc.txt
> >>> >> adding file 2: ..\src\test\data\reuters-21578/all-orgs-strings.lc.txt
> >>> >> adding file 3: ..\src\test\data\reuters-21578/all-people-strings.lc.txt
> >>> >> adding file 4: ..\src\test\data\reuters-21578/all-places-strings.lc.txt
> >>> >> adding file 5: ..\src\test\data\reuters-21578/all-topics-strings.lc.txt
> >>> >> adding file 6: ..\src\test\data\reuters-21578/cat-descriptions_120396.txt
> >>> >> adding file 7: ..\src\test\data\reuters-21578/feldman-cia-worldfactbook-data.txt
> >>> >> adding file 8: ..\src\test\data\reuters-21578/LEWIS.DTD
> >>> >> adding file 9: ..\src\test\data\reuters-21578/README.TXT
> >>> >> adding file 10: ..\src\test\data\reuters-21578/reut2-000.sgm
> >>> >> adding file 11: ..\src\test\data\reuters-21578/reut2-001.sgm
> >>> >> adding file 12: ..\src\test\data\reuters-21578/reut2-002.sgm
> >>> >> adding file 13: ..\src\test\data\reuters-21578/reut2-003.sgm
> >>> >> adding file 14: ..\src\test\data\reuters-21578/reut2-004.sgm
> >>> >> adding file 15: ..\src\test\data\reuters-21578/reut2-005.sgm
> >>> >> adding file 16: ..\src\test\data\reuters-21578/reut2-006.sgm
> >>> >> adding file 17: ..\src\test\data\reuters-21578/reut2-007.sgm
> >>> >> adding file 18: ..\src\test\data\reuters-21578/reut2-008.sgm
> >>> >> adding file 19: ..\src\test\data\reuters-21578/reut2-009.sgm
> >>> >> adding file 20: ..\src\test\data\reuters-21578/reut2-010.sgm
> >>> >> adding file 21: ..\src\test\data\reuters-21578/reut2-011.sgm
> >>> >> adding file 22: ..\src\test\data\reuters-21578/reut2-012.sgm
> >>> >> adding file 23: ..\src\test\data\reuters-21578/reut2-013.sgm
> >>> >> adding file 24: ..\src\test\data\reuters-21578/reut2-014.sgm
> >>> >> adding file 25: ..\src\test\data\reuters-21578/reut2-015.sgm
> >>> >> adding file 26: ..\src\test\data\reuters-21578/reut2-016.sgm
> >>> >> adding file 27: ..\src\test\data\reuters-21578/reut2-017.sgm
> >>> >> adding file 28: ..\src\test\data\reuters-21578/reut2-018.sgm
> >>> >> adding file 29: ..\src\test\data\reuters-21578/reut2-019.sgm
> >>> >> adding file 30: ..\src\test\data\reuters-21578/reut2-020.sgm
> >>> >> adding file 31: ..\src\test\data\reuters-21578/reut2-021.sgm
> >>> >>
> >>> >>
> >>> >> Debug Assertion Failed!
> >>> >> Expression: _BLOCK_TYPE_IS_VALID(pHead->nBlockUse)
> >>> >>
> >>> >> VS 2008 debugger reports a deadlock in:
> >>> >> atomicthreads\clucene\src\core\CLucene\util\Array.h (line 139)
> >>> >>
> >>> >>
> >>> >>
> >>> >> 2010/6/12 Klemens Friedl <fri...@gmail.com>:
> >>> >> > I forgot to mention that I ran the cl_test app earlier today;
> >>> >> > it stopped with a failure at test 97.
> >>> >> > (although I may have used slightly different cmake settings)
> >>> >> >
> >>> >> > I will try out that branch tomorrow, as it's already late here.
> >>> >> >
> >>> >> > Klemens
> >>> >> >
> >>> >> >
> >>> >> > 2010/6/12 Itamar Syn-Hershko <ita...@divrei-tora.com>:
> >>> >> >> I'm running cl_test on a similar environment without any problem
> >>> >> >> (using the default CMake config). One of the tests there indexes the
> >>> >> >> reuters corpus too. Can you try running that?
> >>> >> >>
> >>> >> >> The actual error looks like something we fixed in the atomicthreads
> >>> >> >> branch, which hasn't been merged into master yet due to lack of
> >>> >> >> feedback. Can you try running the demo from that branch (after
> >>> >> >> trying cl_test too)?
> >>> >> >>
> >>> >> >> Itamar.
> >>> >> >>
> >>> >> >>> -----Original Message-----
> >>> >> >>> From: Klemens Friedl [mailto:fri...@gmail.com]
> >>> >> >>> Sent: Saturday, June 12, 2010 10:19 PM
> >>> >> >>> To: clucene-developers@lists.sourceforge.net
> >>> >> >>> Subject: [CLucene-dev] clucene - cl_demo stops with error
> >>> >> >>>
> >>> >> >>> clucene - cl_demo stops with error while indexing reuters corpus
> >>> >> >>>
> >>> >> >>> clucene version: current git master
> >>> >> >>> platform: WinXP SP3
> >>> >> >>> build system: VS 2008 SP1
> >>> >> >>> cmake: 2.8.1
> >>> >> >>> cmake settings: see cmakecache.txt file (attached to email)
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> cl_demo app stops with error:
> >>> >> >>> (only one code line changed, to match the path to the
> >>> >> >>> reuters-21578 corpus directory)
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> F:\Home\Search\clucene\build\bin\Debug>cl_demo.exe
> >>> >> >>> adding file 1: src\test\data\reuters-21578/all-exchanges-strings.lc.txt
> >>> >> >>> adding file 2: src\test\data\reuters-21578/all-orgs-strings.lc.txt
> >>> >> >>> adding file 3: src\test\data\reuters-21578/all-people-strings.lc.txt
> >>> >> >>> adding file 4: src\test\data\reuters-21578/all-places-strings.lc.txt
> >>> >> >>> adding file 5: src\test\data\reuters-21578/all-topics-strings.lc.txt
> >>> >> >>> adding file 6: src\test\data\reuters-21578/cat-descriptions_120396.txt
> >>> >> >>> adding file 7: src\test\data\reuters-21578/feldman-cia-worldfactbook-data.txt
> >>> >> >>> adding file 8: src\test\data\reuters-21578/LEWIS.DTD
> >>> >> >>> adding file 9: src\test\data\reuters-21578/README.TXT
> >>> >> >>> adding file 10: src\test\data\reuters-21578/reut2-000.sgm
> >>> >> >>> adding file 11: src\test\data\reuters-21578/reut2-001.sgm
> >>> >> >>>
> >>> >> >>> => VS 2008 SP1 debugger:
> >>> >> >>> Unhandled exception at 0x10099e4f (clucene-cored.dll) in cl_demo.exe:
> >>> >> >>> 0xC0000005: Access violation writing location 0x01034f74.
> >>> >> >>>
> >>> >> >>> file: clucene\src\core\CLucene\index\DocumentsWriterThreadState.cpp
> >>> >> >>> (line 642)
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> the lucene index files (output of the "dir" command):
> >>> >> >>>
> >>> >> >>> F:\Home\Search\clucene\build\bin\Debug\data>dir
> >>> >> >>>  Directory of F:\Home\Search\clucene\build\bin\Debug\data
> >>> >> >>>
> >>> >> >>> 12.06.2010  20:56    <DIR>          .
> >>> >> >>> 12.06.2010  20:56    <DIR>          ..
> >>> >> >>> 12.06.2010  20:56                20 segments.gen
> >>> >> >>> 12.06.2010  20:56                45 segments_3
> >>> >> >>> 12.06.2010  20:56                 0 write.lock
> >>> >> >>> 12.06.2010  20:56           536.020 _0.cfs
> >>> >> >>> 12.06.2010  20:58           114.688 _1.fdt
> >>> >> >>> 12.06.2010  20:56                 0 _1.fdx
> >>> >> >>>                6 File(s)          650.773 Bytes
> >>> >> >>>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> If I remove half of the reuters-21578 corpus files from the
> >>> >> >>> corpus directory, cl_demo runs through fine!
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> I tried various settings with cmake. I am using gnuwin32's zlib. I
> >>> >> >>> am not using iconv, as it appeared to me to be an optional component.
> >>> >> >>> What are the preferred and tested cmake settings for a common
> >>> >> >>> environment?
> >>> >> >>> I need Unicode support; multithreading would be nice to have; if
> >>> >> >>> possible, I would like to avoid iconv.
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> Kind regards,
> >>> >> >>> Klemens Friedl
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> btw.
> >>> >> >>> the _LUCENE_THREAD_FUNC(atomicIndexTest, _writer) and
> >>> >> >>> _LUCENE_THREAD_FUNC(atomicSearchTest, _directory) may need a return
> >>> >> >>> statement, as VS informed me while testing other cmake settings.
> >>> >> >>> file: clucene\src\test\index\TestThreading.cpp (line 18, 51)
> >>> >> >>>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >> >>
> >>> >>
> >>> >> >> ------------------------------------------------------------------------------
> >>> >> >> ThinkGeek and WIRED's GeekDad team up for the Ultimate
> >>> >> >> GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the lucky
> >>> >> >> parental unit.  See the prize list and enter to win:
> >>> >> >> http://p.sf.net/sfu/thinkgeek-promo
> >>> >> >> _______________________________________________
> >>> >> >> CLucene-developers mailing list
> >>> >> >> CLucene-developers@lists.sourceforge.net
> >>> >> >> https://lists.sourceforge.net/lists/listinfo/clucene-developers
> >>> >> >>
> >>> >> >
> >>> >>
> >>> >
> >>> >
> >>> >
> >>> >
> >>> >
> >>>
> >>
> >>
> >>
> >>
> >
> 



