Hi Stefan,

On 1 Sep 2010, at 10:44, Stefan Guggisberg wrote:
> On Wed, Sep 1, 2010 at 11:38 AM, Robin Wyles <[email protected]> wrote:
>> An update on this...
>>
>> Stefan was indeed correct and it was a charset/encoding issue that was causing
>> Jackrabbit to ignore the existing repository content.
>
> thanks for the information. can you please provide more details about
> the exact nature of the problem?
>

Sure, it seems that mysqldump has a habit of corrupting data in charsets
other than latin1. We forced the use of latin1 using the following commands
to export/import our repository data:

mysqldump -u username -p --default-character-set=latin1 -N database > backup.sql
mysql -u username -p --default-character-set=latin1 database < backup.sql

There's more info here: http://docforge.com/wiki/Mysqldump

Even though this appears to work we're still unable to see any nt:file nodes
whose binary data is stored in the datastore. I'm not sure whether this is a
related or separate issue...

Robin

> cheers
> stefan
>
>>
>> However, now that I have managed to get our existing repository running under
>> 2.1.0 I have a new problem and that is that all the nt:file nodes whose
>> content is stored in the datastore (FileDataStore) are missing. The small
>> nt:file nodes that are stored in the database are visible, just not those in
>> the FileDataStore.
>>
>> When starting up our newly migrated repository for the first time I get a
>> few "Record not found" datastore exceptions and some associated Tika
>> exceptions for those missing datastore records - would those errors prevent
>> the entire datastore from being used? The number of errors is far less than
>> the 3000 or so items in the datastore, so it would suggest that it's either
>> ignoring most of the datastore contents, or at start up at least they are
>> recognised as valid.
>>
>> As before, once our repository has started I am able to add new nodes to the
>> datastore, and these behave as expected.
>>
>> Any help gratefully received - I'm really keen to get our repos onto 2.1.0
>> as some of its new query functionality is much needed!
>>
>> Robin
>>
>> On 27 Aug 2010, at 16:03, Robin Wyles wrote:
>>
>>> Hi Stefan
>>>
>>> On 27 Aug 2010, at 13:11, Stefan Guggisberg wrote:
>>>
>>>> On Fri, Aug 27, 2010 at 2:02 PM, Stefan Guggisberg
>>>> <[email protected]> wrote:
>>>>> On Fri, Aug 27, 2010 at 1:18 PM, Robin Wyles <[email protected]>
>>>>> wrote:
>>>>>> Hi Stefan,
>>>>>>
>>>>>> Thanks for your quick reply...
>>>>>>
>>>>>> On 27 Aug 2010, at 11:36, Stefan Guggisberg wrote:
>>>>>>
>>>>>>> hi robin,
>>>>>>>
>>>>>>> On Fri, Aug 27, 2010 at 11:25 AM, Robin Wyles <[email protected]>
>>>>>>> wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm having problems migrating an existing repository from Jackrabbit
>>>>>>>> 1.6.0 to 2.1.0.
>>>>>>>>
>>>>>>>> Here are the steps I followed to test the migration:
>>>>>>>>
>>>>>>>> 1. Update app to use Jackrabbit 2.1.0, run unit tests etc. Manually
>>>>>>>> test against empty 2.1.0 repository. All works fine here. Our
>>>>>>>> repository configuration has not changed at all between versions.
>>>>>>>>
>>>>>>>> 2. Use mysqldump to export production repository.
>>>>>>>>
>>>>>>>> 3. Copy production repository directory (workspace folder, datastore,
>>>>>>>> index folders etc.) to test machine.
>>>>>>>>
>>>>>>>> 4. Import SQL file from step 2 above to new DB on test machine.
>>>>>>>>
>>>>>>>> 5. Start application on test machine.
>>>>>>>>
>>>>>>>> The result of the above is that the application starts up without
>>>>>>>> error but that the repository appears empty. I am able to add new
>>>>>>>> nodes to the repository, which behave correctly within the application
>>>>>>>> yet none of the existing nodes are visible. I've tried xpath queries
>>>>>>>> against known paths, e.g. "//library/*" and these return 0 nodes.
>>>>>>>>
>>>>>>>> A few things I've tried/noticed:
>>>>>>>>
>>>>>>>> 1. Repeating steps 3 and 4 above, then removing the old index
>>>>>>>> directories before starting the application. Jackrabbit creates new
>>>>>>>> lucene indexes, but they are very small, just like they would be when
>>>>>>>> initialising an empty repository. Also, the index files are called
>>>>>>>> indexes_2 rather than indexes as they were under 1.6.0.
>>>>>>>>
>>>>>>>> 2. When starting the app after the migration I notice that 4 extra
>>>>>>>> records have been added to the BUNDLE table, 3 extra records are added
>>>>>>>> to the VERSION_BUNDLE table and 2 extra records added to the
>>>>>>>> VERSION_NAMES table. Again, this seems to be consistent with what is
>>>>>>>> automatically added to the database when a new repository is
>>>>>>>> initialised.
>>>>>>>>
>>>>>>>> So, basically it appears that Jackrabbit is completely ignoring the
>>>>>>>> existing repository data, and instead initialising a new repos using
>>>>>>>> the existing database…
>>>>>>>>
>>>>>>>> If anyone has any ideas as to how I can get 2.1.0 to recognise our
>>>>>>>> existing repository they'd be gratefully received - I feel there must
>>>>>>>> be something simple I've overlooked!
>>>>>>>
>>>>>>> hmm, seems like the key values (i.e. the id format) have changed.
>>>>>>> however, i am not aware of such a change.
>>>>>>> maybe someone else knows more?
>>>>>>
>>>>>> The release notes for Jackrabbit 2.0.0 claim that it is backward
>>>>>> compatible with 1.x repositories. I've seen a couple of messages on the
>>>>>> users list relating to migration issues but these seem to involve custom
>>>>>> nodetypes, whereas our repository has no custom nodetypes.
>>>>>>
>>>>>> How may I see what key values/ID format is used by the different
>>>>>> versions? This sounds like quite a major change to me, and I'm sure
>>>>>> something that would've been documented!
>>>>>
>>>>> absolutely. however, if you're saying that 4 extra records have been
>>>>> inserted into the BUNDLE table
>>>>> and the BUNDLE table already had n>=4 records, i can only explain it
>>>>> with a changed binary representation
>>>>> of the record id's.
>>>>>
>>>>> the 4 BUNDLE records are:
>>>>>
>>>>> / (root node)
>>>>> /jcr:system
>>>>> /jcr:system/jcr:nodeTypes
>>>>> /jcr:system/jcr:versionStore
>>>>>
>>>>> the values of the ids of those nodes are hard-coded in jackrabbit.
>>>>> on startup, those nodes will be created if they don't exist.
>>>>>
>>>>> i am not a mysql expert. have you compared the configurations
>>>>> of both mysql instances? maybe it's some strange charset/encoding
>>>>> issue...
>>>
>>> Both mysql instances use the same charset/encoding, and all tables on both
>>> instances are set to utf-8 for encoding and collation.
>>>
>>> The only difference between the two mysql instances is their version -
>>> slightly older on our production machine.
>>>
>>> However, what you say makes sense - it really does look like Jackrabbit
>>> can't find those nodes on start up which implies there's a charset/encoding
>>> issue.
>>>
>>> I'm going to see if I can duplicate the database on our production mysql
>>> instance and test against that...
>>>
>>>>
>>>> or maybe it's a problem with the mysql indexes on those tables...
>>>>
>>>
>>> I tried deleting the mysql indexes and recreating them, but it didn't seem
>>> to make any difference.
>>>
>>> Thanks,
>>>
>>> Robin
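
For anyone reading this thread later, here is a rough Python sketch of how a charset conversion during dump/restore could make existing records unfindable, which is the explanation suggested above. It assumes node ids are stored as raw 16-byte keys (an assumption about the schema, not something confirmed in this thread) and that some step in the export/import path reads those bytes as latin1 text and writes them back as UTF-8. The UUID is purely illustrative, and `misconverted` and `survives_round_trip` are hypothetical helper names.

```python
import uuid

# Illustrative 16-byte binary key (assumption: ids are stored as raw bytes).
EXAMPLE_ID = uuid.UUID("cafebabe-cafe-babe-cafe-babecafebabe").bytes


def misconverted(raw: bytes) -> bytes:
    """Simulate a tool that treats binary data as latin1 text and re-encodes it as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


def survives_round_trip(raw: bytes) -> bool:
    """Return True if the key is byte-for-byte identical after the simulated conversion."""
    return misconverted(raw) == raw


if __name__ == "__main__":
    damaged = misconverted(EXAMPLE_ID)
    print("original:", len(EXAMPLE_ID), "bytes", EXAMPLE_ID.hex())
    print("restored:", len(damaged), "bytes", damaged.hex())
    print("lookup by original key would still match:", survives_round_trip(EXAMPLE_ID))
```

Every byte of this example key is above 0x7F, so each one expands to two bytes under the simulated conversion. A lookup by the original 16-byte value would then find nothing, which would be consistent with an application creating fresh records alongside the old ones. Pure ASCII data passes through unchanged, which may be why such a problem is easy to miss.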
