Thanks Maarten, that's actually the way it could be accomplished with the 
current code.  The TO DO list is quite extensive so I can't promise a date 
for this, but building on Maarten's script idea and adding 3 new options 
to the bulk_upload command, this could be done automatically.

--dir_regex_metadata: to interpret a document's path as metadata values 
using a regular expression
--from_path: to import documents from a directory instead of a zipped file
--recursive: to traverse all subdirectories

so for something like:

/project number 01/customer/Customer Name A/
/project number 01/customer/Customer Name B/
/project number 02/customer/Customer Name C/

My regex-fu is poor, but the command line would be more or less like this:

$ ./manage.py bulk_upload --noinput \
    --dir_regex_metadata "/project number (?P<project_number>\d+)/customer/(?P<customer_name>[a-zA-Z0-9 ]+)" \
    --document_type "Accounting documents" \
    --from_path /var/accounting/docs/ --recursive

with the regular expression parameter name (the name inside ?P<  >) being 
the internal metadata type name.
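
To make the idea concrete, here is a minimal Python sketch of how the named groups could be turned into metadata values. This is not existing Mayan code: the function name metadata_from_path and the DIR_REGEX constant are made up for illustration, and the pattern is simply the one from the command line above.

```python
import re

# Hypothetical pattern, copied from the proposed --dir_regex_metadata value.
# Each named group stands in for an internal metadata type name.
DIR_REGEX = re.compile(
    r"/project number (?P<project_number>\d+)"
    r"/customer/(?P<customer_name>[a-zA-Z0-9 ]+)"
)


def metadata_from_path(path, pattern=DIR_REGEX):
    """Return {metadata type name: value} for a path, or {} if nothing matches."""
    match = pattern.search(path)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    for path in (
        "/project number 01/customer/Customer Name A/",
        "/project number 01/customer/Customer Name B/",
        "/project number 02/customer/Customer Name C/",
    ):
        print(path, "->", metadata_from_path(path))
```

Using search instead of match means the pattern can sit anywhere inside the full path, so a prefix such as /var/accounting/docs/ does not need to be repeated in the expression.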

Thoughts?

--Roberto


On Friday, November 2, 2012 3:36:01 AM UTC-4, maarten wrote:
>
> Hi Roberto and James,
>  
> I agree with Roberto's statement about the filesystem. That is exactly the 
> reason why filesystem directory trees always end in a mess: as soon as more 
> than one directory (='metadata') can be applicable, users will choose 
> randomly. In a DMS you can then just use a second or third metadata 
> tag.
> However, I do understand James' problem, which is very common in a 
> transition, as the best 'metadata' he currently has is the interpretation 
> of the directory tree. E.g. a structure like 
> "project number XX\sales\quotations" is in fact 3 metadata tags to be filled.
>  
> You could try a script where the --metadata arguments for bulk_upload 
> are filled in by some logic from an ls command. I've never tried this (and 
> I'm really not up to speed enough on Python to say whether it would work), 
> but this command with multiple --metadata arguments could then do the trick 
> if run per subfolder:
>  
> $ ./manage.py bulk_upload --noinput --metadata '{"project": "bulk"}' 
> --document_type "Accounting documents" compressed.zip
>  
> After this the magic of indexes could rebuild the original directory 
> structure, but then with many more useful cross-sections of your document 
> metadata.
>  
> Maarten
>  
>
> On Wednesday, October 24, 2012 12:31:14 AM UTC+2, Roberto Rosario wrote:
>
>> No, the directory structure would not be 'cloned'.  I added this to the 
>> TO DO list for future versions, but I'm a little hesitant to add it, 
>> because it would just be duplicating the inefficient paradigm of filesystem 
>> directory trees only on a web interface.  This is the reason I created the 
>> automatic indexing where Mayan creates a hierarchical structure based on 
>> user defined rules to help users avoid being slaves of a manually updated 
>> structure as other DMS software does.  It is a little work at the beginning 
>> while you create the rules, but then you don't have to ever worry again 
>> about documents being placed in the correct hierarchical unit.
>>
>> It is not an accusation, it is a documented fact: 
>> http://news.cnet.com/8301-1023_3-57514677-93/corruption-in-wikiland-paid-pr-scandal-erupts-at-wikipedia/
>> It has long been suspected that this was happening; the episode in the link 
>> is the most documented and alarming, as it was done by a very senior 
>> Wikipedia editor.  The editor received payment to edit and favor the page of 
>> the government of Gibraltar so that it would be featured on Wikipedia's 
>> front page.  An article is lucky to land on the front page; the page of the 
>> country of Gibraltar landed there 17 times, boosting their SEO results sky 
>> high.  What is most disgusting is how the editor involved and others argue 
>> that getting paid for favorably editing (or dumbing down a competitor's 
>> article) is not a conflict of interest!
>>
>> With this information in mind, now read the discussion of Mayan's previous 
>> Wikipedia article here: 
>> http://en.wikipedia.org/wiki/Wikipedia:Articles_for_deletion/Mayan_edms  
>> It is extensive but you can clearly see how: 1) The article was tagged for 
>> deletion from the start, even though articles are usually moved to an 
>> 'inactive' (userfied) mode where defenders can keep improving them and 
>> resubmit them for evaluation, 2) The criteria for deletion were produced 
>> out of thin air, 3) The existing articles for commercial DMS software would 
>> fail those same criteria, 4) The editors were not following Wikipedia's code 
>> of conduct (accusing new users of being SPAs), 5) They confused the issue on 
>> purpose, mixing defense of the article with WP:OTHERSTUFFEXISTS 
>> <http://en.wikipedia.org/wiki/Wikipedia:OTHERSTUFFEXISTS> to 
>> invalidate the defense, 6) The editors had no idea what DMS software is and 
>> confused it with CMS software; and tell me that it is hard to deny how the 
>> editors appeared to be personally motivated beyond their duties as editors 
>> to eradicate the Mayan article from Wikipedia.
>>
>> I'm not trying to be controversial, just answering your argument and 
>> explaining my reasons for not wanting or caring about a Mayan EDMS article 
>> on Wikipedia.
>>
>> --Roberto
>>
>>
>> On Tuesday, October 23, 2012 1:44:13 PM UTC-4, James Hondo wrote:
>>>
>>> We are a small accounting firm and have a Windows server working as 
>>> fileserver with all of our clients' documents sorted by year, month, 
>>> activities and such.  My question is; Can I import not only the documents, 
>>> but also their existing directory structure?
>>>
>>> Wow, I don't always agree with the veteran editors' decisions, but calling 
>>> them corrupt is a very heavy-handed and strongly worded accusation.  Still, 
>>> I think Mayan is a great piece of software with a great community and 
>>> worthy of an article in Wikipedia, just something to consider. 
>>>
>>> On Tuesday, October 23, 2012 3:35:11 AM UTC-4, Roberto Rosario wrote:
>>>>
>>>> Hi James,
>>>>
>>>> Thanks, I appreciate your comments :)
>>>>
>>>> Check this thread to see if this is more or less what you are 
>>>> interested in: 
>>>> https://groups.google.com/forum/?fromgroups=&pli=1#!topic/mayan-edms/M_S5ZSVV5U4%5B1-25%5D
>>>>
>>>> As far as I know there are no Mayan EDMS articles on Wikipedia. There 
>>>> was one attempt and the article got deleted with the most ridiculous of 
>>>> excuses; it became clear that the editors evaluating the article were 
>>>> seriously biased against Mayan for what I can only think were monetary 
>>>> reasons.  Wikipedia as an idea is great, but the project has fallen from 
>>>> grace; there are very serious moderation and vandalism issues that are as 
>>>> old as the project and that they have not been able to address.  I don't 
>>>> have any interest in an article about Mayan on Wikipedia.  Sorry if that 
>>>> sounds a bit harsh since you are just offering to help, I just want to save 
>>>> you the time and effort of building and defending a great article only to 
>>>> have corrupt editors delete it once you comply with the self-serving 
>>>> objections they will produce.  I wholeheartedly thank you for your 
>>>> interest, but it is not worth your time.
>>>>
>>>> --Roberto
>>>>
>>>>
>>>> On Monday, October 22, 2012 9:55:58 PM UTC-4, James Hondo wrote:
>>>>>
>>>>> Hello, thanks a lot for releasing your software, it is great!  I have 
>>>>> been looking for something like it for a long time; it does everything I 
>>>>> needed and more.  One thing I couldn't find in the documentation: can 
>>>>> it automatically mirror the structure of the document directories when 
>>>>> doing an initial import?  Also, I noticed the Wikipedia article is 
>>>>> missing a great deal of stuff. I've worked on a few articles myself and 
>>>>> would gladly help polish Mayan's article if you like.
>>>>>
>>>>> James
>>>>>
>>>>
