This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/groovy-dev-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new d7e718b 2024/11/18 22:21:42: Generated dev website from groovy-website@8275f32
d7e718b is described below
commit d7e718b11218bc65a860196464896047ecfdb089
Author: jenkins <[email protected]>
AuthorDate: Mon Nov 18 22:21:42 2024 +0000
2024/11/18 22:21:42: Generated dev website from groovy-website@8275f32
---
blog/groovy-lucene.html | 22 ++++++++++++++--------
1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/blog/groovy-lucene.html b/blog/groovy-lucene.html
index b081f7f..41a0c8b 100644
--- a/blog/groovy-lucene.html
+++ b/blog/groovy-lucene.html
@@ -100,6 +100,8 @@ so we won’t.</p>
| software
| projects
| https
+ | or
+ | prefixes
| technologies
)
)\w+ # end capture #2
@@ -134,16 +136,16 @@ var histogram = [:].withDefault { 0 }
new File(blogBaseDir).traverse(nameFilter: ~/.*\.adoc/) { file -> // <b class="conum">(2)</b>
var m = file.text =~ tokenRegex // <b class="conum">(3)</b>
- var projects = m*.get(2).grep()*.toLowerCase()*.replaceAll('\n', ' ').countBy() // <b class="conum">(4)</b>
- if (projects) {
- println "$file.name: $projects" // <b class="conum">(5)</b>
- projects.each { k, v -> histogram[k] += v } // <b class="conum">(6)</b>
+ var projects = m*.get(2).grep()*.toLowerCase()*.replaceAll('\n', ' ') // <b class="conum">(4)</b>
+ var counts = projects.countBy() // <b class="conum">(5)</b>
+ if (counts) {
+ println "$file.name: $counts" // <b class="conum">(6)</b>
+ counts.each { k, v -> histogram[k] += v } // <b class="conum">(7)</b>
}
}
println()
-
-histogram.sort { e -> -e.value }.each { k, v -> // <b class="conum">(7)</b>
+histogram.sort { e -> -e.value }.each { k, v -> // <b class="conum">(8)</b>
var label = "$k ($v)"
println "${label.padRight(32)} ${bar(v, 0, 50, 50)}"
}</code></pre>
@@ -161,7 +163,10 @@ histogram.sort { e -> -e.value }.each { k, v -> // <b class="conum">(7)</b
<p>We define our matcher</p>
</li>
<li>
-<p>This pulls out project names (capture group 2) and ignores other words (using grep) then aggregates the hits for that file</p>
+<p>This pulls out project names (capture group 2), ignores other words (using grep), converts to lowercase, and removes newlines for the case where a term might span over the end of a line</p>
+</li>
+<li>
+<p>This aggregates the count hits for that file</p>
</li>
<li>
<p>We print out each blog post file name and its project references</p>
@@ -238,7 +243,8 @@ apache flink (1) █▏
<h2 id="_using_lucene">Using Lucene</h2>
<div class="sectionbody">
<div class="paragraph">
-<p>Okay, regular expressions weren’t that hard but in general we might want to search many things.
+<p><span class="image right"><img src="https://www.apache.org/logos/res/lucene/default.png" alt="lucene logo" width="100"></span>
+Okay, regular expressions weren’t that hard but in general we might want to search many things.
Search frameworks like Lucene help with that. Let’s see what it looks like to apply
Lucene to our problem.</p>
</div>