mikemccand commented on issue #61:
URL:
https://github.com/apache/lucene-jira-archive/issues/61#issuecomment-1193307444
I was curious about the `Lucene Fields`, so I wrote up a quick aggregator of
all populated fields on our Jira issues:
```
import glob
import json

field_count = {}  # field name -> number of issues where it is populated
votes_count = {}  # vote total -> number of issues with that many votes

for file_name in glob.glob('jira-dump/*.json'):
    with open(file_name) as f:
        d = json.load(f)
    votes = d['fields']['votes']['votes']
    votes_count[votes] = 1 + votes_count.get(votes, 0)
    for field, value in d['fields'].items():
        # Count only truthy values; None, '', [] and {} are treated as unpopulated
        if value:
            field_count[field] = 1 + field_count.get(field, 0)

# Most frequently populated fields first
for name, count in sorted(field_count.items(), key=lambda a: -a[1]):
    print(f'{name}: {count}')

print('Votes:')
for name, count in sorted(votes_count.items(), key=lambda a: -a[1]):
    print(f'{name}: {count}')
```
Output:
```
-*- mode: compilation; default-directory:
"/l/orig-lucene-jira-archive/migration/" -*-
Compilation started at Sun Jul 24 08:17:22
python print_custom_fields.py
customfield_12310420: 10645
priority: 10645
customfield_12313422: 10645
status: 10645
customfield_12310920: 10645
creator: 10645
reporter: 10645
aggregateprogress: 10645
progress: 10645
votes: 10645
worklog: 10645
issuetype: 10645
customfield_12314020: 10645
project: 10645
watches: 10645
created: 10645
updated: 10645
summary: 10645
comment: 10645
customfield_12311820: 10645
workratio: 10437
description: 10339
customfield_12310120: 9741
resolution: 8699
resolutiondate: 8699
fixVersions: 6945
attachment: 6703
components: 5779
assignee: 5769
versions: 3618
issuelinks: 2164
timetracking: 1515
aggregatetimespent: 1315
timespent: 1307
environment: 1022
labels: 611
customfield_10010: 437
parent: 333
aggregatetimeoriginalestimate: 208
aggregatetimeestimate: 208
timeestimate: 205
timeoriginalestimate: 205
subtasks: 79
customfield_12310250: 63
customfield_12313520: 32
customfield_12311020: 15
duedate: 8
customfield_12311024: 4
Votes:
0: 9802
1: 545
2: 142
3: 63
4: 25
5: 22
6: 10
8: 7
7: 6
12: 5
11: 4
9: 3
14: 2
10: 2
13: 1
22: 1
19: 1
28: 1
16: 1
36: 1
15: 1
```
I guess the Lucene fields are all of these `customfield_N` fields ... but they
are heavily denormalized on export LOL. I'll try to sift through them.
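
As a starting point for that sifting, here is a rough sketch that looks only at the `customfield_*` entries and keeps one sample value per field, so it is easier to guess what each one holds. The helper name `summarize_custom_fields` and the sample issues are made up for illustration; in practice the list would come from loading the `jira-dump/*.json` files as in the script above.

```python
from collections import Counter


def summarize_custom_fields(issues):
    """Count populated customfield_* entries and keep one sample value for each."""
    counts = Counter()
    samples = {}
    for issue in issues:
        for field, value in issue['fields'].items():
            if field.startswith('customfield_') and value:
                counts[field] += 1
                # Remember the first populated value we see for this field
                samples.setdefault(field, value)
    return counts, samples


# Hypothetical in-memory issues (the values are invented, not real Jira data)
issues = [
    {'fields': {'customfield_10010': 'example value', 'summary': 'a'}},
    {'fields': {'customfield_10010': None, 'customfield_12310250': ['x']}},
]

counts, samples = summarize_custom_fields(issues)
for field, count in counts.most_common():
    print(f'{field}: {count} (sample: {samples[field]!r})')
```

Seeing an actual value next to each opaque ID should make it quicker to tell which of these are the Lucene-specific fields and which are just export noise.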
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]