Thanks Maarten, that is actually how it could be accomplished with the
current code. The TO DO list is quite extensive, so I can't promise a date
for this, but building on Maarten's script idea and adding three new
options to the bulk_upload command, this could be done automatically:
--dir_regex_metadata: interpret a document's path as metadata values
  using a regular expression
--from_path: import documents from a directory instead of a zipped file
--recursive: traverse all subdirectories
So for a directory layout like:
/project number 01/customer/Customer Name A/
/project number 01/customer/Customer Name B/
/project number 02/customer/Customer Name C/
My regex-fu is poor, but the command line would look more or less like this:
$ ./manage.py bulk_upload --noinput \
    --dir_regex_metadata "/project number (?P<project_number>\d+)/customer/(?P<customer_name>[a-zA-Z0-9 ]+)" \
    --document_type "Accounting documents" \
    --from_path /var/accounting/docs/ --recursive
with each regular expression group name (the name inside ?P< >) being
the internal metadata type name.
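
To make the idea concrete, here is a rough sketch of just the matching step in plain Python. It is not Mayan code: the function name extract_metadata and the sample file names are made up for illustration, and the pattern is the one from the command above. Each named group becomes a metadata type name and the matched text becomes its value.

```python
import re

# Pattern taken from the example command; each named group is assumed to
# correspond to an internal metadata type name.
DIR_REGEX = re.compile(
    r"/project number (?P<project_number>\d+)"
    r"/customer/(?P<customer_name>[a-zA-Z0-9 ]+)"
)


def extract_metadata(path, pattern=DIR_REGEX):
    """Return {metadata_type_name: value} for a path, or {} if no match.

    Hypothetical helper, not part of Mayan EDMS.
    """
    match = pattern.search(path)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    # Sample paths are illustrative only.
    for path in (
        "/project number 01/customer/Customer Name A/invoice.pdf",
        "/project number 02/customer/Customer Name C/receipt.pdf",
    ):
        print(path, "->", extract_metadata(path))
```

Paths that do not match the pattern return an empty dict here; whether such documents should be skipped or uploaded without metadata is still an open design question.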
Thoughts?
--Roberto
On Friday, November 2, 2012 3:36:01 AM UTC-4, maarten wrote:
>
> Hi Roberto and James,
>
> I agree with Roberto's statement about the filesystem. That is exactly
> the reason why filesystem directory trees always end in a mess: as soon
> as more than one directory (= 'metadata') is applicable, users will choose
> at random. In a DMS you can then just use a second or third metadata
> tag.
> However, I do understand James' problem, which is very common in a
> transition, as the best 'metadata' he currently has is the interpretation
> of the directory tree. E.g. a structure "project number XX\sales\quotations"
> is in fact three metadata tags to be filled.
>
> You could try a script where the --metadata arguments for bulk_upload
> are filled in by some logic from an ls command. I've never tried it (and
> am not up to speed enough on Python to say whether it would work), but
> this command with multiple --metadata arguments could then do the trick
> if run per subfolder:
>
> $ ./manage.py bulk_upload --noinput --metadata '{"project": "bulk"}'
> --document_type "Accounting documents" compressed.zip
>
> After this the magic of indexes could rebuild the original directory
> structure, but with many more useful cross-sections of your document
> metadata.
>
> Maarten
>
>
> On Wednesday, October 24, 2012 12:31:14 AM UTC+2, Roberto Rosario wrote:
>
>> No, the directory structure would not be 'cloned'. I added this to the
>> TO DO list for future versions, but I'm a little hesitant to add it,
>> because it would just duplicate the inefficient paradigm of filesystem
>> directory trees, only on a web interface. This is the reason I created
>> automatic indexing, where Mayan creates a hierarchical structure based on
>> user-defined rules, so users avoid being slaves to a manually updated
>> structure as with other DMS software. It is a little work at the
>> beginning while you create the rules, but then you never have to worry
>> again about documents being placed in the correct hierarchical unit.
>>
>> It is not an accusation, it is a documented fact:
>> http://news.cnet.com/8301-1023_3-57514677-93/corruption-in-wikiland-paid-pr-scandal-erupts-at-wikipedia/
>> It has long been suspected that this was happening; the episode in the
>> link is the most documented and alarming, as it was done by a very senior
>> Wikipedia editor. The editor received payment to edit and favor the page
>> of the government of Gibraltar so that it would be featured on
>> Wikipedia's front page. An article is lucky to land on the front page;
>> the page of the country of Gibraltar landed there 17 times, boosting
>> their SEO results sky high. What is most disgusting is how the editor
>> involved and others argue that getting paid for favorably editing (or
>> dumbing down a competitor's article) is not a conflict of interest!
>>
>> With this information in mind, now read the discussion of Mayan's
>> previous Wikipedia article here:
>> http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Mayan_edms
>> It is extensive, but you can clearly see how: 1) the article was tagged
>> for deletion from the start, even though articles are usually moved to an
>> 'inactive' (userfied) mode where defenders can keep improving them and
>> resubmit them for evaluation, 2) the criteria for deletion were produced
>> out of thin air, 3) the existing articles for commercial DMS software
>> would fail those same criteria, 4) the editors were not following
>> Wikipedia's code of conduct (accusing new users of being SPAs), 5) they
>> confused the issue on purpose, mixing defense of the article with
>> WP:OTHERSTUFFEXISTS<http://en.wikipedia.org/wiki/Wikipedia:OTHERSTUFFEXISTS>
>> to invalidate the defense, 6) the editors had no idea what DMS software
>> is and confused it with CMS software; and it is hard to deny that the
>> editors appeared to be personally motivated, beyond their duties as
>> editors, to eradicate the Mayan article from Wikipedia.
>>
>> I'm not trying to be controversial, just answering your argument and
>> explaining my reasons for not wanting/caring about a Mayan EDMS article
>> on Wikipedia.
>>
>> --Roberto
>>
>>
>> On Tuesday, October 23, 2012 1:44:13 PM UTC-4, James Hondo wrote:
>>>
>>> We are a small accounting firm and have a Windows server working as a
>>> fileserver with all of our clients' documents sorted by year, month,
>>> activities and such. My question is: can I import not only the
>>> documents, but also their existing directory structure?
>>>
>>> Wow, I don't always agree with the veteran editors' decisions, but
>>> calling them corrupt is a very heavy-handed and strongly worded
>>> accusation. Still, I think Mayan is a great piece of software with a
>>> great community and worthy of an article in Wikipedia; just something
>>> to consider.
>>>
>>> On Tuesday, October 23, 2012 3:35:11 AM UTC-4, Roberto Rosario wrote:
>>>>
>>>> Hi James,
>>>>
>>>> Thanks, I appreciate your comments :)
>>>>
>>>> Check this thread to see if this is more or less what you are
>>>> interested in:
>>>> https://groups.google.com/forum/?fromgroups=&pli=1#!topic/mayan-edms/M_S5ZSVV5U4%5B1-25%5D
>>>>
>>>> As far as I know there are no Mayan EDMS articles on Wikipedia. There
>>>> was one attempt, and the article got deleted on the most ridiculous of
>>>> excuses; it became clear that the editors evaluating the article were
>>>> seriously biased against Mayan, for what I can only think were monetary
>>>> reasons. Wikipedia as an idea is great, but the project has fallen from
>>>> grace: there are very serious moderation and vandalism issues that are
>>>> as old as the project and that they have not been able to address. I
>>>> don't have any interest in an article about Mayan on Wikipedia. Sorry
>>>> if that sounds a bit harsh since you are just offering to help; I just
>>>> want to save you the time and effort of building and defending a great
>>>> article only to have corrupt editors delete it once you comply with the
>>>> self-serving objections they will produce. I wholeheartedly thank you
>>>> for your interest, but it is not worth your time.
>>>>
>>>> --Roberto
>>>>
>>>>
>>>> On Monday, October 22, 2012 9:55:58 PM UTC-4, James Hondo wrote:
>>>>>
>>>>> Hello, thanks a lot for releasing your software, it is great! I have
>>>>> been looking for something like it for a long time; it does everything
>>>>> I needed and then more. One thing I couldn't find in the
>>>>> documentation: can it automatically mirror the structure of the
>>>>> document directories when doing an initial import? Also, I noticed the
>>>>> Wikipedia article is missing a great deal of stuff. I've worked on a
>>>>> few articles myself and would gladly help polish Mayan's article if
>>>>> you like.
>>>>>
>>>>> James
>>>>>
>>>>