On Thursday, September 1, 2016 at 4:21:50 PM UTC-7, Jeremy C. Reed wrote:
>
> Sorry if I overlooked it, but I cannot find documentation or
> explanation for the "Spam Filtering: User handling" webpage at
> /admin/spamfilter/user.

As far as I know that's not an oversight; I haven't seen any documentation. It would be great if we could write some.

> My page currently shows:
>
> -=-=-=-=-=-=-=-=-=
> Spam Filtering: User handling (data overview) 126893 entries
>
> Overview All Registered Unused
>
> There are 126893 different entries in the database, 125992 users are
> registered and 125889 have not been used.
>
> Old user: __________ New user: __________
>
> [Button: Change unauthorized user]
>
> [Button: Remove 44302 temporary sessions]
>
> [Button: Convert emails to registered usernames]
>
> -=-=-=-=-=-=-=-=-=
>
> I cannot use the All, Registered, or Unused links as they all cause the
> webserver or webbrowser to time out.

The tables on those pages aren't paginated, so I'm not surprised the page load times out with so many entries.

1.0.9dev has a long history:
https://trac.edgewall.org/log/plugins/1.0/spam-filter
Therefore I bumped the version to 1.0.9 just now, and it would be good if you upgraded so we can be certain which version you are running.

> What is the definition here of a "user"? What are these many entries?

A user is an authenticated entry in the session table, or a username (of an unauthenticated session) that is used in a wiki edit, repository change, ticket edit, etc. You might have usernames that don't map to authenticated entries if you allow anonymous edits of your site, or if you are connected to version control repositories with revision authors that don't have accounts in your system.

When a user visits the site a session is created. For anonymous sessions the SID (session ID) is just a hash. If the user authenticates, the SID is the username. If the session saves a preference such as "days back" on the timeline, or username/email in prefs, then an entry is created in the session_attribute table.
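
To make the session description concrete, here is a minimal sketch that counts authenticated and anonymous rows in the session table, and also counts SIDs that appear in session_attribute without a matching session row. It assumes the column layout from the Trac schema (session: sid, authenticated, last_visit; session_attribute: sid, authenticated, name, value). The function names, the in-memory SQLite database, and all sample rows are made up for illustration. This is not how SpamFilter computes its numbers; it is just one way to look at the same tables.

```python
import sqlite3


def summarize_sessions(conn):
    """Count session rows by type, plus SIDs that only exist in session_attribute."""
    cur = conn.cursor()
    cur.execute("SELECT authenticated, COUNT(*) FROM session GROUP BY authenticated")
    counts = {bool(auth): n for auth, n in cur.fetchall()}

    # Attribute rows whose (sid, authenticated) pair has no row in session.
    cur.execute("""
        SELECT COUNT(*) FROM (
            SELECT DISTINCT a.sid, a.authenticated
            FROM session_attribute AS a
            LEFT JOIN session AS s
              ON s.sid = a.sid AND s.authenticated = a.authenticated
            WHERE s.sid IS NULL
        )
    """)
    orphans = cur.fetchone()[0]
    return {
        "authenticated": counts.get(True, 0),
        "anonymous": counts.get(False, 0),
        "orphaned_attribute_sids": orphans,
    }


def demo():
    """Build a toy database (hypothetical data) and summarize it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE session (sid TEXT, authenticated INTEGER, last_visit INTEGER);
        CREATE TABLE session_attribute (sid TEXT, authenticated INTEGER,
                                        name TEXT, value TEXT);
    """)
    conn.executemany("INSERT INTO session VALUES (?, ?, ?)", [
        ("alice", 1, 1472700000),   # authenticated: SID is the username
        ("3f2a9c", 0, 1472700100),  # anonymous: SID is a hash
        ("9b77e1", 0, 1472700200),
    ])
    conn.executemany("INSERT INTO session_attribute VALUES (?, ?, ?, ?)", [
        ("alice", 1, "email", "alice@example.org"),
        ("3f2a9c", 0, "daysback", "30"),            # made-up attribute name
        ("ghost", 1, "email", "ghost@example.org"),  # no matching session row
        ("ghost", 1, "name", "Ghost"),
    ])
    return summarize_sessions(conn)


if __name__ == "__main__":
    print(demo())
```

Against a real site you would open a copy of the environment's database instead of the in-memory one (never the live file), and the queries may need adjusting for PostgreSQL or MySQL.
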
You can see info on the session and session_attribute tables here:
https://trac.edgewall.org/wiki/TracDev/DatabaseSchema/Common#Tablesession

> (My /admin/accounts/users lists only 163 users. Only a few are known
> spammers and a few are test accounts.)
>
> What does it mean by 125889 users have not been used? Does this imply
> that we had that many users over time? I don't believe that we removed
> over 125,000 accounts.

Did you have a problem with spam registrations at some time in the past? One possible explanation I see from looking at the code is that you have a lot of session_attribute entries that don't necessarily match a session entry.

I think we'll need to look at the sessions using TracAdmin to get more info. Try "trac-admin $env session list" and "trac-admin $env session list | wc -l". How many entries do you find? If you write those out into a spreadsheet and review them, we can count how many authenticated sessions there are, how many unauthenticated sessions there are, and maybe get some insight into the phantom registered accounts. TracAdmin's "session delete" can be used to remove entries you don't want.

It's also quite possible there is a bug in SpamFilter. Some of the code quality of the plugin is rather suspect, and there aren't many tests.

> I have been the primary Trac admin for the site for around 7 years. I
> migrated the server a few times, upgraded Trac a few times, but the
> content (wiki, tickets, user accounts) has never been reset.
>
> What are the Old user/New user fields and Change unauthorized user
> button for? (Any example?)

From the implementation, it looks like the feature is used to change a username. When a user adds a ticket comment the username (session id) is stored in the database. So if a username is to be changed, the entries have to be changed in many tables throughout the database.
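
As a rough illustration of why a rename touches so many places, here is a sketch that updates a username in several columns inside one transaction. The (table, column) list is only a small, illustrative subset and is an assumption on my part, not the list SpamFilter uses. The function names and sample rows are hypothetical, and the demo runs against a toy in-memory SQLite database rather than a real Trac environment.

```python
import sqlite3

# Illustrative subset of columns that store a username; NOT an exhaustive list.
USERNAME_COLUMNS = [
    ("ticket", "reporter"),
    ("ticket", "owner"),
    ("ticket_change", "author"),
    ("wiki", "author"),
]


def rename_user(conn, old, new):
    """Replace old with new in every listed column; return rows changed per column."""
    changed = {}
    with conn:  # one transaction: commit on success, roll back on error
        for table, column in USERNAME_COLUMNS:
            cur = conn.execute(
                f"UPDATE {table} SET {column} = ? WHERE {column} = ?",
                (new, old),
            )
            changed[f"{table}.{column}"] = cur.rowcount
    return changed


def demo():
    """Build toy tables (hypothetical data) and rename one user."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE ticket (id INTEGER, reporter TEXT, owner TEXT);
        CREATE TABLE ticket_change (ticket INTEGER, author TEXT);
        CREATE TABLE wiki (name TEXT, author TEXT);
    """)
    conn.execute("INSERT INTO ticket VALUES (1, 'jdoe@example.org', 'alice')")
    conn.execute("INSERT INTO ticket_change VALUES (1, 'jdoe@example.org')")
    conn.execute("INSERT INTO wiki VALUES ('WikiStart', 'alice')")
    return rename_user(conn, "jdoe@example.org", "jdoe")


if __name__ == "__main__":
    print(demo())
```

Every table added to the schema, by Trac or by a plugin, would need another entry in that list, and a missed entry leaves the old name behind.
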
This is a shortcoming of Trac, and it would be ideal if the username wasn't used in the columns of so many tables, and something like a foreign key was used instead. If what I said is true, then the button is poorly named, and was probably named per a specific use-case that the author had in mind.

> What does "Remove .... temporary sessions mean"?

That purges anonymous sessions, same as the "trac-admin session purge" command:
https://trac.edgewall.org/wiki/TracAdmin
We get many tens of thousands of anonymous sessions on trac-hacks.org each week; presumably this is primarily due to bots and not actual users.

> What is "Convert emails to registered usernames" mean?

I guess it's used to convert an email address to a username, when a user has made an edit and used an email address rather than a username as a session id. This seems like a feature that isn't very useful for most sites.

> Any way to access the mode=all, mode=authorized (Registered), or
> mode=unused pages? Maybe I can restrict how many are displayed? (Sorry
> I didn't read the source about this.)

We'll probably have to try to purge entries from the database first using TracAdmin.

> What is the purpose and best practice of this /admin/spamfilter/user
> interface?

To remove "unused" accounts from the database. I haven't looked at the specific conditions to label an account as unused, but it's something like: more than one year old, and inactive (no repository changes, ticket edits or wiki edits). It's probably mainly useful for public-facing sites.

I've considered integrating SpamFilter into Trac, and now that I look closely at this user handling feature, if SpamFilter was integrated into Trac I think the "User handling" feature would probably be best as a separate plugin. Or, maybe as some of the features of AccountManager are integrated into Trac, a feature like "Change unauthorized user" (i.e. rename username) could be added as an account management feature.
Of course, if we fixed how usernames were stored in the tables it might not even be necessary.
https://trac.edgewall.org/ticket/12398

> Maybe answers in this thread can supplement docs.
>
> This is Trac 1.0.12 installed from FreeBSD package. The
> TracAccountManager-0.4.4 and TracSpamFilter-1.0.9 are installed from
> subversion (then created egg and copied to plugins).
>
> (By the way, the FreeBSD package for trac-accountmanager-0.5.12583_1,1
> is incompatible due to:
> 2016-08-31 16:57:51,588 Trac[loader] ERROR: Skipping
> "spamfilter.registration =
> tracspamfilter.filters.registration [account]": (version conflict
> "VersionConflict: (TracAccountManager 0.5dev-r0
> (/usr/local/lib/python2.7/site-packages),
> Requirement.parse('TracAccountManager>=0.4'))")
>
> I guess it didn't like the "dev-r0" part in the version check. I didn't
> look at source code to workaround it. So I deinstalled package and
> installed as mentioned above.)

I think that issue is fixed in SpamFilter and AccountManager, so the latest packages just need to be put into the FreeBSD package management system:
https://trac-hacks.org/changeset/14554/accountmanagerplugin
https://trac.edgewall.org/changeset/14337/plugins/1.0/spam-filter

You might talk with Greg about that:
https://groups.google.com/d/msg/trac-dev/EUCPZYz0haQ/rNxuqqUUCAAJ

Feel free to contact me directly by email if you would be willing to share the output of "session list", which you probably don't want to post publicly. We could follow up here with the outcomes.

- Ryan
