This is an automated email from the ASF dual-hosted git repository.
git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/groovy-dev-site.git
The following commit(s) were added to refs/heads/asf-site by this push:
new cff37b5 2024/12/21 22:22:51: Generated dev website from groovy-website@1927aa2
cff37b5 is described below
commit cff37b50b89b35f7a275f6a3d06114a028381a0b
Author: jenkins <[email protected]>
AuthorDate: Sat Dec 21 22:22:51 2024 +0000
2024/12/21 22:22:51: Generated dev website from groovy-website@1927aa2
---
blog/groovy-lucene.html | 210 ++++++++++++++++++++++--------------------------
1 file changed, 97 insertions(+), 113 deletions(-)
diff --git a/blog/groovy-lucene.html b/blog/groovy-lucene.html
index a96420b..2ad3b93 100644
--- a/blog/groovy-lucene.html
+++ b/blog/groovy-lucene.html
@@ -138,13 +138,28 @@ are wanting to follow along and run these examples:</p>
<div class="colist arabic">
<ol>
<li>
-<p>You’d need to check out the Groovy website and point to it here</p>
+<p>You’d need to check out the Groovy website and point <code>baseDir</code> to it here</p>
</li>
</ol>
</div>
<div class="paragraph">
-<p>Now our script will traverse all the files in that directory, processing them with our regex
-and track the hits we find.</p>
+<p>First, let’s create a little helper method for printing a pretty
+graph of our results (we’ll use the <code>colorize</code> method from <a href="https://github.com/dialex/JColor">JColor</a>):</p>
+</div>
+<div class="listingblock">
+<div class="content">
+<pre class="prettyprint highlight"><code data-lang="groovy">def display(Map<String, Integer> data, int max, int scale = 1) {
+ data.each { k, v ->
+ var label = "$k ($v)"
+ var color = k.startsWith('apache') ? MAGENTA_TEXT() : BLUE_TEXT()
+ println "${label.padRight(32)} ${colorize(bar(v * scale, 0, max, max), color)}"
+ }
+}</code></pre>
+</div>
+</div>
+<div class="paragraph">
+<p>Now our script will traverse all the files in that directory,
+processing them with our regex and track the hits we find.</p>
</div>
<div class="listingblock">
<div class="content">
@@ -161,10 +176,7 @@ new File(baseDir).traverse(nameFilter: ~/.*\.adoc/) { file -> // <b class="c
}
println "\nFrequency of total hits mentioning a project:"
-histogram.sort { e -> -e.value }.each { k, v -> // <b class="conum">(8)</b>
- var label = "$k ($v)"
- println "${label.padRight(32)} ${bar(v, 0, 50, 50)}"
-}</code></pre>
+display(histogram.sort { e -> -e.value }, 50) // <b class="conum">(8)</b></code></pre>
</div>
</div>
<div class="colist arabic">
@@ -212,7 +224,7 @@ groovy-2-5-clibuilder-renewal.adoc: [apache commons cli:2]
groovy-graph-databases.adoc: [apache age:11, apache hugegraph:3, apache tinkerpop:3]
groovy-haiku-processing.adoc: [eclipse collections:3]
groovy-list-processing-cheat-sheet.adoc: [eclipse collections:4, apache commons collections:3]
-groovy-lucene.adoc: [apache nutch:1, apache solr:1, apache lucene:2, apache commons:1, apache commons math:2]
+groovy-lucene.adoc: [apache nutch:1, apache solr:1, apache lucene:3, apache commons:4, apache commons math:2, apache spark:1]
groovy-null-processing.adoc: [eclipse collections:6, apache commons collections:4]
groovy-pekko-gpars.adoc: [apache pekko:4]
groovy-record-performance.adoc: [apache commons codec:1]
@@ -229,33 +241,33 @@ wordle-checker.adoc: [eclipse collections:3]
zipping-collections-with-groovy.adoc: [eclipse collections:4]
Frequency of total hits mentioning a project:
-eclipse collections (50)
██████████████████████████████████████████████████▏
-apache commons math (18) ██████████████████▏
-apache ignite (17) █████████████████▏
-apache spark (13) █████████████▏
-apache mxnet (12) ████████████▏
-apache wayang (11) ███████████▏
-apache age (11) ███████████▏
-eclipse deeplearning4j (8) ████████▏
-apache commons collections (7) ███████▏
-apache commons csv (6) ██████▏
-apache nlpcraft (5) █████▏
-apache pekko (4) ████▏
-apache hugegraph (3) ███▏
-apache tinkerpop (3) ███▏
-apache flink (2) ██▏
-apache commons cli (2) ██▏
-apache lucene (2) ██▏
-apache commons (2) ██▏
-apache opennlp (2) ██▏
-apache ofbiz (1) █▏
-apache beam (1) █▏
-apache commons numbers (1) █▏
-apache nutch (1) █▏
-apache solr (1) █▏
-apache commons codec (1) █▏
-apache commons io (1) █▏
-apache kie (1) █▏
+eclipse collections (50) <span
style="color:blue">██████████████████████████████████████████████████</span>▏
+apache commons math (18) <span
style="color:purple">██████████████████</span>▏
+apache ignite (17) <span
style="color:purple">█████████████████</span>▏
+apache spark (14) <span
style="color:purple">██████████████</span>▏
+apache mxnet (12) <span
style="color:purple">████████████</span>▏
+apache wayang (11) <span
style="color:purple">███████████</span>▏
+apache age (11) <span
style="color:purple">███████████</span>▏
+eclipse deeplearning4j (8) <span style="color:blue">████████</span>▏
+apache commons collections (7) <span
style="color:purple">███████</span>▏
+apache commons csv (6) <span style="color:purple">██████</span>▏
+apache nlpcraft (5) <span style="color:purple">█████</span>▏
+apache pekko (4) <span style="color:purple">████</span>▏
+apache hugegraph (3) <span style="color:purple">███</span>▏
+apache tinkerpop (3) <span style="color:purple">███</span>▏
+apache lucene (3) <span style="color:purple">███</span>▏
+apache flink (2) <span style="color:purple">██</span>▏
+apache commons cli (2) <span style="color:purple">██</span>▏
+apache commons (2) <span style="color:purple">██</span>▏
+apache opennlp (2) <span style="color:purple">██</span>▏
+apache ofbiz (1) <span style="color:purple">█</span>▏
+apache beam (1) <span style="color:purple">█</span>▏
+apache commons numbers (1) <span style="color:purple">█</span>▏
+apache nutch (1) <span style="color:purple">█</span>▏
+apache solr (1) <span style="color:purple">█</span>▏
+apache commons codec (1) <span style="color:purple">█</span>▏
+apache commons io (1) <span style="color:purple">█</span>▏
+apache kie (1) <span style="color:purple">█</span>▏
</pre>
</div>
</div>
@@ -374,19 +386,13 @@ println "\nFrequency of total hits mentioning a project (top 10):"
var termFreq = terms.collectEntries { term ->
[term.text(), reader.totalTermFreq(term)] // <b class="conum">(3)</b>
}
-termFreq.sort(byReverseValue).take(10).each { k, v ->
- var label = "$k ($v)"
- println "${label.padRight(32)} ${bar(v, 0, 50, 50)}"
-}
+display(termFreq.sort(byReverseValue).take(10), 50)
println "\nFrequency of documents mentioning a project (top 10):"
var docFreq = terms.collectEntries { term ->
[term.text(), reader.docFreq(term)] // <b class="conum">(4)</b>
}
-docFreq.sort(byReverseValue).take(10).each { k, v ->
- var label = "$k ($v)"
- println "${label.padRight(32)} ${bar(v * 2, 0, 20, 20)}"
-}</code></pre>
+display(docFreq.sort(byReverseValue).take(10), 20, 2)</code></pre>
</div>
</div>
<div class="colist arabic">
@@ -422,7 +428,7 @@ groovy-2-5-clibuilder-renewal.adoc: [apache commons cli:2]
groovy-graph-databases.adoc: [apache age:11, apache hugegraph:3, apache tinkerpop:3]
groovy-haiku-processing.adoc: [eclipse collections:3]
groovy-list-processing-cheat-sheet.adoc: [apache commons collections:3, eclipse collections:4]
-groovy-lucene.adoc: [apache commons:1, apache commons math:2, apache lucene:2, apache nutch:1, apache solr:1]
+groovy-lucene.adoc: [apache commons:4, apache commons math:2, apache lucene:3, apache nutch:1, apache solr:1, apache spark:1]
groovy-null-processing.adoc: [apache commons collections:4, eclipse collections:6]
groovy-pekko-gpars.adoc: [apache pekko:4]
groovy-record-performance.adoc: [apache commons codec:1]
@@ -439,28 +445,28 @@ wordle-checker.adoc: [eclipse collections:3]
zipping-collections-with-groovy.adoc: [eclipse collections:4]
Frequency of total hits mentioning a project (top 10):
-eclipse collections (50)
██████████████████████████████████████████████████▏
-apache commons math (17) █████████████████▏
-apache ignite (17) █████████████████▏
-apache spark (13) █████████████▏
-apache mxnet (12) ████████████▏
-apache wayang (11) ███████████▏
-apache age (11) ███████████▏
-eclipse deeplearning4j (8) ████████▏
-apache commons collections (7) ███████▏
-apache commons csv (6) ██████▏
+eclipse collections (50) <span
style="color:blue">██████████████████████████████████████████████████</span>▏
+apache commons math (17) <span
style="color:purple">█████████████████</span>▏
+apache ignite (17) <span
style="color:purple">█████████████████</span>▏
+apache spark (14) <span
style="color:purple">██████████████</span>▏
+apache mxnet (12) <span
style="color:purple">████████████</span>▏
+apache wayang (11) <span
style="color:purple">███████████</span>▏
+apache age (11) <span
style="color:purple">███████████</span>▏
+eclipse deeplearning4j (8) <span style="color:blue">████████</span>▏
+apache commons collections (7) <span
style="color:purple">███████</span>▏
+apache commons csv (6) <span style="color:purple">██████</span>▏
Frequency of documents mentioning a project (top 10):
-eclipse collections (10) ████████████████████▏
-apache commons math (7) ██████████████▏
-apache spark (5) ██████████▏
-apache ignite (4) ████████▏
-apache commons csv (4) ████████▏
-eclipse deeplearning4j (3) ██████▏
-apache wayang (3) ██████▏
-apache flink (2) ████▏
-apache commons collections (2) ████▏
-apache commons (2) ████▏
+eclipse collections (10) <span
style="color:blue">████████████████████</span>▏
+apache commons math (7) <span
style="color:purple">██████████████</span>▏
+apache spark (6) <span
style="color:purple">██████████</span>▏
+apache ignite (4) <span
style="color:purple">████████</span>▏
+apache commons csv (4) <span
style="color:purple">████████</span>▏
+eclipse deeplearning4j (3) <span style="color:blue">██████</span>▏
+apache wayang (3) <span style="color:purple">██████</span>▏
+apache flink (2) <span style="color:purple">████</span>▏
+apache commons collections (2) <span style="color:purple">████</span>▏
+apache commons (2) <span style="color:purple">████</span>▏
</pre>
<div class="paragraph">
@@ -581,10 +587,7 @@ results.scoreDocs.each { ScoreDoc scoreDoc -> // <b class="conum">(3)</b>
}
println "\nFrequency of total hits mentioning a project (top 10):"
-histogram.sort { e -> -e.value }.take(10).each { k, v -> // <b class="conum">(6)</b>
- var label = "$k ($v)"
- println "${label.padRight(32)} ${bar(v, 0, 50, 50)}"
-}</code></pre>
+display(histogram.sort { e -> -e.value }.take(10), 50) // <b class="conum">(6)</b></code></pre>
</div>
</div>
<div class="colist arabic">
@@ -630,7 +633,7 @@ fun-with-obfuscated-groovy.adoc: [apache commons math:1]
groovy-2-5-clibuilder-renewal.adoc: [apache commons cli:2]
groovy-graph-databases.adoc: [apache age:11, apache hugegraph:3, apache tinkerpop:3]
groovy-haiku-processing.adoc: [eclipse collections:3]
-groovy-lucene.adoc: [apache nutch:1, apache solr:1, apache lucene:2, apache commons:1, apache commons math:2]
+groovy-lucene.adoc: [apache nutch:1, apache solr:1, apache lucene:3, apache commons:4, apache commons math:2, apache spark:1]
groovy-pekko-gpars.adoc: [apache pekko:4]
groovy-record-performance.adoc: [apache commons codec:1]
handling-byte-order-mark-characters.adoc: [apache commons io:1]
@@ -645,16 +648,16 @@ wordle-checker.adoc: [eclipse collections:3]
zipping-collections-with-groovy.adoc: [eclipse collections:4]
Frequency of total hits mentioning a project (top 10):
-eclipse collections (50)
██████████████████████████████████████████████████▏
-apache commons math (18) ██████████████████▏
-apache ignite (17) █████████████████▏
-apache spark (13) █████████████▏
-apache mxnet (12) ████████████▏
-apache wayang (11) ███████████▏
-apache age (11) ███████████▏
-eclipse deeplearning4j (8) ████████▏
-apache commons collections (7) ███████▏
-apache commons csv (6) ██████▏
+eclipse collections (50) <span
style="color:blue">██████████████████████████████████████████████████</span>▏
+apache commons math (18) <span
style="color:purple">██████████████████</span>▏
+apache ignite (17) <span
style="color:purple">█████████████████</span>▏
+apache spark (14) <span
style="color:purple">█████████████</span>▏
+apache mxnet (12) <span
style="color:purple">████████████</span>▏
+apache wayang (11) <span
style="color:purple">███████████</span>▏
+apache age (11) <span
style="color:purple">███████████</span>▏
+eclipse deeplearning4j (8) <span style="color:blue">████████</span>▏
+apache commons collections (7) <span
style="color:purple">███████</span>▏
+apache commons csv (6) <span style="color:purple">██████</span>▏
</pre>
<div class="paragraph">
@@ -812,7 +815,7 @@ groovy-2-5-clibuilder-renewal.adoc: [apache commons cli:2]
groovy-graph-databases.adoc: [apache age:11, apache hugegraph:3, apache tinkerpop:3]
groovy-haiku-processing.adoc: [eclipse collections:3]
groovy-list-processing-cheat-sheet.adoc: [eclipse collections:4, apache commons collections:3]
-groovy-lucene.adoc: [apache nutch:1, apache solr:1, apache lucene:2, apache commons:1, apache commons math:2]
+groovy-lucene.adoc: [apache nutch:1, apache solr:1, apache lucene:3, apache commons:4, apache commons math:2, apache spark:1]
groovy-null-processing.adoc: [eclipse collections:6, apache commons collections:4]
groovy-pekko-gpars.adoc: [apache pekko:4]
groovy-record-performance.adoc: [apache commons codec:1]
@@ -850,16 +853,10 @@ var projects = new TaxonomyFacetIntAssociations('$projectHitCounts', taxonReader
var hitData = projects.getTopChildren(topN, 'projectHitCounts').labelValues
println "\nFrequency of total hits mentioning a project (top $topN):"
-hitData.each { m ->
- var label = "$m.label ($m.value)"
- println "${label.padRight(32)} ${bar(m.value, 0, 50, 50)}"
-}
+display(hitData.collectEntries{ lv -> [lv.label, lv.value] }, 50)
println "\nFrequency of documents mentioning a project (top $topN):"
-hitData.each { m ->
- var label = "$m.label ($m.count)"
- println "${label.padRight(32)} ${bar(m.count * 2, 0, 20, 20)}"
-}</code></pre>
+display(hitData.collectEntries{ lv -> [lv.label, lv.count] }, 20, 2)</code></pre>
</div>
</div>
<div class="paragraph">
@@ -867,33 +864,20 @@ hitData.each { m ->
</div>
<pre>
Frequency of total hits mentioning a project (top 5):
-eclipse collections (50)
██████████████████████████████████████████████████▏
-apache commons math (18) ██████████████████▏
-apache ignite (17) █████████████████▏
-apache spark (13) █████████████▏
-apache mxnet (12) ████████████▏
+eclipse collections (50) <span
style="color:blue">██████████████████████████████████████████████████</span>▏
+apache commons math (18) <span
style="color:purple">██████████████████</span>▏
+apache ignite (17) <span
style="color:purple">█████████████████</span>▏
+apache spark (14) <span
style="color:purple">██████████████</span>▏
+apache mxnet (12) <span
style="color:purple">████████████</span>▏
Frequency of documents mentioning a project (top 5):
-eclipse collections (10) ████████████████████▏
-apache commons math (7) ██████████████▏
-apache spark (5) ██████████▏
-apache ignite (4) ████████▏
-apache mxnet (1) ██▏
+eclipse collections (10) <span
style="color:blue">████████████████████</span>▏
+apache commons math (7) <span
style="color:purple">██████████████</span>▏
+apache ignite (4) <span
style="color:purple">████████</span>▏
+apache spark (6) <span
style="color:purple">████████████</span>▏
+apache mxnet (1) <span style="color:purple">██</span>▏
</pre>
-<div class="admonitionblock note">
-<table>
-<tr>
-<td class="icon">
-<div class="title">Note</div>
-</td>
-<td class="content">
-At the time of writing, there is a bug in sorting for the second of these graphs.
-A <a href="https://github.com/apache/lucene/issues/14008">fix</a> is coming.
-</td>
-</tr>
-</table>
-</div>
<div class="paragraph">
<p>Now, the taxonomy information about document frequency is for the top hits scored using the number of hits.
One of our other facets (<code>projectFileCounts</code>) tracks document frequency independently.
@@ -917,7 +901,7 @@ Frequency of documents mentioning a project (top 5):
dim=projectFileCounts path=[] value=-1 childCount=27
eclipse collections (10)
apache commons math (7)
- apache spark (5)
+ apache spark (6)
apache ignite (4)
apache commons csv (4)
@@ -963,7 +947,7 @@ dim=projectNameCounts path=[] value=-1 childCount=2
Frequency of documents mentioning a project with path [apache] (top 5):
dim=projectNameCounts path=[apache] value=-1 childCount=18
commons (16)
- spark (5)
+ spark (6)
ignite (4)
wayang (3)
flink (2)