Hi,

Nice work!

If you're comfortable hacking around in the Beagle codebase, you could
probably make the tokenizer (called an analyzer in Lucene parlance)
behave more appropriately for code, so that things like underscores
aren't stripped out.  Take a look at the BeagleAnalyzer class in
beagled/LuceneCommon.cs for more info.
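
As a rough illustration of the difference that would make, here is a
small shell sketch.  It only mimics the effect of tokenizing with tr; it
is not how BeagleAnalyzer works internally, and the function names
tokenize_plain and tokenize_code are made up for this example.

```shell
# Illustration only: imitates the *effect* of two tokenizing strategies.
# This is not Beagle's or Lucene's actual analyzer code.

# Split on every non-alphanumeric character, so '_' acts as a separator.
tokenize_plain() {
    printf '%s\n' "$1" | tr -cs '[:alnum:]' '\n'
}

# Same, but treat '_' as part of a token so identifiers stay whole.
tokenize_code() {
    printf '%s\n' "$1" | tr -cs '[:alnum:]_' '\n'
}

# Splits into three tokens, which is why the query below needs spaces.
tokenize_plain 'ENGLISH_STOP_WORDS'

# Keeps the identifier as a single token.
tokenize_code 'ENGLISH_STOP_WORDS'
```

With the second behaviour, a query for the identifier as written could
match directly instead of having to be split into separate words.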

Joe

On Wed, May 19, 2010 at 11:12 AM, Haojun Bao <baohao...@gmail.com> wrote:
> Hi, all
>
> Beagle put grep on steroids for me :-) Thanks a lot, y'all Beagle hackers!
>
> The idea is simple and practical: run beagle-static-query first, then
> run grep on the results.
>
> For example, to grep for "ENGLISH_STOP_WORDS" in the Beagle source code,
> I first run beagle-static-query:
>
>    beagle-static-query\
>     --add-static-backend /src/beagle/.beagle\
>     --backend none\
>     --max-hits 100000\
>     'ENGLISH STOP WORDS'
>
> (Note that, as I figured out, the `_' characters have to be removed
> when beagling :-)
>
> Then I grep for the original regexp only in the following files,
> because Beagle has already determined that only these files contain all
> three words of 'ENGLISH STOP WORDS':
>
>    /src/beagle/beagled/ExtractContent.cs
>    /src/beagle/beagled/LuceneCommon.cs
>    /src/beagle/beagled/Lucene.Net/Analysis/Standard/StandardAnalyzer.cs
>    /src/beagle/beagled/Lucene.Net/Analysis/StopAnalyzer.cs
>    /src/beagle/beagled/Snowball.Net/Lucene.Net/Analysis/Snowball/SnowballAnalyzer.cs
>    /src/beagle/NEWS
>
> This way, even with the ~2 gigabytes of Android source code, you can
> usually grep and get the results in a few seconds. Best of all, it works
> not only with source code, but with any text files.
>
> If you are interested, the source code is at
>
>   git://github.com/baohaojun/windows-config.git
>
> And there's a detailed README at 
> http://github.com/baohaojun/windows-config/raw/master/gcode/beagle/beagle-grep-readme.org
>
> _______________________________________________
> dashboard-hackers mailing list
> dashboard-hackers@gnome.org
> http://mail.gnome.org/mailman/listinfo/dashboard-hackers
>
