At 14:33 15/01/2002 +0100, Sasa Janiska wrote:
>On Today, -0000, Richard Barrett wrote:
>
>Hi Richard!
>Thank you very much for your reply.
>
> > This is a straight htdig configuration issue. At the minimum you will have
> > to add start_url directives to htdig's conf file for each of the list
> > archives or ensure that links from one of the start_url directives in
> > htdig's conf file eventually lead to each of the list archives. You will
> > also have to have some sort of cron job to rebuild htdig's search indices
> > regularly (probably daily) to include new archived material.
>
>That's easy.
>
> > The following patches can be applied to the mailman 2.0.8 (and earlier
> > versions of 2.0.x) to integrate htdig with Mailman and provide search of
> > archives generated by the internal (pipermail) archiver.
>
>Do you have something ready for V2.1?

I have already posted versions of the patch for MM 2.1a3 and MM 2.1cvs on
sourceforge. The latter is for the MM CVS at the date and time noted in the
posting, but it may need updating depending on what has changed in the CVS
since my posting. It is my intention to publish a version of the patch for
the beta and final versions of MM 2.1 as soon as I can after they are
available. Just check sourceforge for Mailman patches 444879 and 444884 and
read the notes I post with each patch file.

> > The patches are not of direct relevance if you have opted to use an
> > external archiver.
>
>If pipermail can do the job, it isn't necessary. I am thinking about
>external archiver seeing that pipermail is no longer maintained ..

In the context of Mailman I think it can be said that pipermail is still
being maintained. MM contains its own copy of the pipermail code in Python,
and if you search the developer archives you will see there is ongoing work
and discussion about its future. The archiver will certainly be enhanced and
maintained through MM 2.1, albeit the enhancements may not be that major. Do
you do Python? Maybe you could make a contribution!

> > The benefit of the integration of htdig with Mailman archives generated by
> > pipermail is that it provides per list search facilities with a search form
> > on each list's archive TOC page and uses Mailman's security mechanism for
> > limiting access to private mail archives via search responses; in fact you
> > can only access a private list archive's search form if you are authorised
> > to access the list. The patches also automatically build htdig config
> > files for each archived list and provide cron scripts for maintaining
> > htdig's search indices.
>
>That's very important to limit access for private list archives.
>Actually, only students should have access to the mailing lists, and
>only for those courses they are enrolled in.

If you go with the external archiver I guess you will have to apply
authentication and access control via the web server used to access the
archives produced. You may want to consider how you can automate keeping two
things in step: the access control data for each private list's archives, in
a format the web server can use, and the subscription information held by MM.

As an aside, the htdig/MM integration I produced uses per list search forms
embedded in the list archive TOC page, in association with per list htdig
config files and per list search indexes. The primary reason is that this
authenticates the user before the search is done and prevents unauthorised
users from seeing links and synopsis information which they are not entitled
to access.

The approach I adopted helps overcome a problem with search indexes that
contain information about both private and public data. If you have such an
index you have to do one of the following:

1. if you are serious about security, use your own search script to run the
   search engine's search and then filter the results returned by it to
   remove links, and their associated synopsis information, which the user
   is not authorised to see. The problem with this is that if you have a
   large search space then checking all the returned results is going to be
   demanding of system performance.

2. if you don't mind people reading snippets of data they are not authorised
   to see, in the synopsis returned in association with each link, you let
   the user see all the results returned. Having aroused their interest you
   then annoy them by refusing to let them follow one of the links that the
   search just returned to them.

My approach sidesteps both these issues reasonably neatly, but I'm sure there
are a dozen other ways of achieving the same objectives using any combination
of list manager/archiver/search engine.

>I'll definitely try with your suggestion.
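
To make option 1 above a little more concrete, here is a minimal sketch in
Python of filtering search results before they are shown to the user. It is
not code from the htdig patches. Every name in it is hypothetical: SearchHit,
private_list_for_url, filter_hits, the assumed "/private/<listname>/" URL
layout, and the is_authorised callback, which stands in for whatever
membership check your list manager or web server provides.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional
from urllib.parse import urlparse


@dataclass
class SearchHit:
    """One result returned by the search engine (hypothetical shape)."""
    url: str
    synopsis: str


def private_list_for_url(url: str,
                         private_prefix: str = "/private/") -> Optional[str]:
    """Return the list name for a URL under the assumed private-archive
    prefix, or None if the URL is treated as public."""
    path = urlparse(url).path
    if not path.startswith(private_prefix):
        return None
    # First path component after the prefix is taken to be the list name.
    return path[len(private_prefix):].split("/", 1)[0]


def filter_hits(hits: Iterable[SearchHit],
                user: str,
                is_authorised: Callable[[str, str], bool]) -> List[SearchHit]:
    """Drop hits (link and synopsis together) the user may not see."""
    decisions: Dict[str, bool] = {}  # one authorisation check per list
    visible: List[SearchHit] = []
    for hit in hits:
        list_name = private_list_for_url(hit.url)
        if list_name is None:
            visible.append(hit)      # public archive, always shown
            continue
        if list_name not in decisions:
            decisions[list_name] = is_authorised(user, list_name)
        if decisions[list_name]:
            visible.append(hit)
    return visible
```

Caching the decision per list name keeps the number of authorisation checks
down, but every returned hit still has to be examined, which is the
performance cost mentioned in option 1.
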
>
>Since Pipermail is no longer developed, do you think about some patch
>with external archivers like Mhonarc or Hypermail?

I'm looking at producing a more generalised patch to simplify producing
closer integrations of other search engines with Mailman archives. I guess it
might be worth expanding my thinking to generalise to mail archives produced
by other archivers and searching them with different search engines.

>Sincerely,
>Sasa

------------------------------------------------------
Mailman-Users maillist  -  [EMAIL PROTECTED]
http://mail.python.org/mailman/listinfo/mailman-users