Hi Miranda,

I upgraded our sites from 3.0.1 to 3.5.2 a couple of weeks ago, and although I had no problems exporting with the export servlets, I did have the same issues with line breaks.
The line breaks are caused by some XML formatting that I was not able to suppress. However, I successfully worked around this problem by opening the exported XML in a text editor (e.g. UltraEdit) and replacing all line breaks with spaces. (You have to remove the line-break characters, not the <br> tags.) After importing the "hacked" XML, the data was OK, and there was no problem with binary data like images or movies.

Exporting also took ages on my system. Besides assigning enough RAM to Tomcat, I just had to wait for the exports.

HTH,
Claudio

-----Original Message-----
From: [email protected] [mailto:[EMAIL PROTECTED]]
Sent: Monday, 4 February 2008 06:40
To: [email protected]
Subject: [magnolia-user] Need advice on 3.0.2 to 3.5.3 migration

Hello all,

We are going to be performing a somewhat emergency upgrade on our Magnolia 3.0.2 instance next week because we are experiencing a couple of database issues that are no longer acceptable to the client:

1. As I have mailed about several times, without really receiving any suggestions on how to remedy it, our database has grown out of control. Our 50-60 page site has an authoring database of over 1.2 GB at this time, and the public one is over 300 MB. The sheer size of these databases is caused almost entirely by the version tables. The site has only been in operation for 8 months and sees fairly low editing activity most weeks, yet it grows by several hundred MB a month. The nightly backups of these databases are chewing up disk space like candy.

2. The client is experiencing extremely long lag times between activating a page and seeing it in the Inbox, and more often than we would like experiences corruption in the workflow process (javax.jcr.RepositoryException: failed to retrieve item state of item...) that requires us to drop the Expressions and Store tables to clear it up before they are able to perform any page activations again.
We have tried unsuccessfully to rid ourselves of these enormous version tables by following the instructions on the Magnolia documentation website for disabling versioning, thinking that once versioning was turned off we could drop the versioning tables. This was unfortunately a bad idea and put our database in a bad state where nothing could be activated because there were missing nodes (presumably from the version tables we dropped... whoops), and we had to restore from a backup.

We have decided to export our existing repositories to XML and hopefully rid ourselves of the workflow and versioning entirely, since these are apparently disabled by default in 3.5.3. Our client has requested that we remove the workflow/versioning because they are causing more trouble than they are worth to the client.

However, now we are faced with another problem that I have asked for advice about in the past without success... the export process. I have never been able to successfully obtain an XML export of our site from the admin Tools -> Export page. We are able to export pages/page trees from the Website view fine, but whenever we use the Export page:

1. It takes hours to produce an export file that is only about 2-3 MB in size. I tried to generate one on Friday on my local test server and had to kill the process after 3 hours without a file being produced. I am assuming, but have no real idea, that this is related to our 1.2 GB database size, even though I have not requested that any version information be kept. The last time I tried to get an export, it took about 1.5 hours, but at the time the database was about half the size.

2. Once we do obtain the XML export, all the line breaks are converted to <br>, which produces completely messed up pages on import. In the past some users suggested we make sure that we do not have formatting selected as an export option, but we always leave this blank and the XML is still formatted.
I have seen several mentions in JIRA that something like this was supposedly fixed in 3.0 Final, but this is 3.0.2 we see this on.

Right now my only option is really to go through the Website view and export each page tree individually, but I'd really like to do it the "right" way and get the whole repository at once.

Does anyone have any suggestions on how we could actually export the whole website repository the correct way? Anything I can do to speed it up or make it be formatted correctly? If we upgrade to 3.0.5 first, will that help the export speed and/or the formatting problems? I am afraid to run the export process on the production site right now and possibly cause performance and/or memory issues.

Thank you in advance for any advice on our exports!

-- Miranda

--
Miranda Jones
Objective Consulting, Inc.
http://www.spiders.com

----------------------------------------------------------------
for list details see
http://documentation.magnolia.info/docs/en/editor/stayupdated.html
----------------------------------------------------------------
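
The workaround described in the reply at the top of this thread (replacing every line-break character in the exported XML with a space before importing it) could probably be scripted rather than done by hand in a text editor. The following is only a minimal sketch of that idea, not anything that ships with Magnolia: the function names and the file names in the usage note are hypothetical, and it assumes the export is UTF-8 encoded. Literal <br> tags are left untouched; only real line-break characters are replaced.

```python
import re
from pathlib import Path

# Matches Windows (\r\n), old Mac (\r), and Unix (\n) line endings.
LINE_BREAK = re.compile(r"\r\n|\r|\n")


def strip_line_breaks(xml_text: str) -> str:
    """Replace every literal line-break character with a single space."""
    return LINE_BREAK.sub(" ", xml_text)


def clean_export(src: Path, dst: Path) -> None:
    """Write a copy of the export at src to dst with line breaks removed.

    Assumption: the export file is UTF-8 encoded.
    """
    # newline="" stops Python from translating line endings on read/write,
    # so the regular expression sees the file's actual characters.
    with src.open("r", encoding="utf-8", newline="") as f:
        text = f.read()
    with dst.open("w", encoding="utf-8", newline="") as f:
        f.write(strip_line_breaks(text))
```

It could be called as, for example, clean_export(Path("website.xml"), Path("website.cleaned.xml")), where both file names are placeholders. Writing to a separate file keeps the original export untouched in case the cleaned version does not import as expected.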
