Hi Stefan,

On 1 Sep 2010, at 10:44, Stefan Guggisberg wrote:

> On Wed, Sep 1, 2010 at 11:38 AM, Robin Wyles <[email protected]> wrote:
>> An update on this...
>> 
>> Stefan was indeed correct: it was a charset/encoding issue that was causing
>> Jackrabbit to ignore the existing repository content.
> 
> thanks for the information. can you please provide more details about
> the exact nature of the problem?
> 

Sure, it seems that mysqldump has a habit of corrupting data when the charset
is anything other than latin1. We forced the use of latin1 with the following
commands to export/import our repository data:

mysqldump -u username -p --default-character-set=latin1 -N database > backup.sql
mysql -u username -p --default-character-set=latin1 database < backup.sql

There's more info here:

http://docforge.com/wiki/Mysqldump
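
For anyone hitting the same thing, here is a minimal Python sketch of why a
charset conversion can damage binary key columns while latin1 does not. It only
illustrates the general byte-level effect and is not a reproduction of what
mysqldump does internally. The 16-byte id (ROOT_ID) and the helper roundtrip()
are made up for the example; the UUID is an assumed placeholder, not a value
taken from the repository discussed here.

```python
import uuid

# Assumption: a placeholder 16-byte binary node id, used only for illustration.
ROOT_ID = uuid.UUID("cafebabe-cafe-babe-cafe-babecafebabe").bytes


def roundtrip(raw: bytes, charset: str) -> bytes:
    """Interpret raw bytes as text in `charset`, then encode the text back.

    Undecodable byte sequences are replaced, which is where data gets lost.
    """
    text = raw.decode(charset, errors="replace")
    return text.encode(charset, errors="replace")


latin1_copy = roundtrip(ROOT_ID, "latin-1")
utf8_copy = roundtrip(ROOT_ID, "utf-8")

# latin1 maps every byte value to exactly one character, so nothing is lost.
print(latin1_copy == ROOT_ID)  # True
# Bytes such as 0xCA 0xFE are not valid UTF-8, so they get replaced.
print(utf8_copy == ROOT_ID)    # False
```

If the keys of the BUNDLE table were altered in a similar way during the dump,
a lookup by the original id would find nothing, which would match the "empty
repository" symptom. mysqldump also has a --hex-blob option that writes binary
columns as hex literals; it was not tried in this thread.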

Even though this appears to work, we're still unable to see any nt:file nodes
whose binary data is stored in the datastore. I'm not sure whether this is a
related or separate issue...
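
One way to narrow this down would be to check whether the record files for
known content actually exist in the copied datastore directory. The Python
sketch below assumes that FileDataStore names each record after the SHA-1 hex
digest of its content and nests it under three directories taken from the first
six hex characters; that layout is an assumption to verify against the
Jackrabbit documentation for the version in use. record_path() and
missing_records() are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def record_path(datastore_root: Path, content: bytes) -> Path:
    """Return where a record for `content` would live under the assumed layout."""
    digest = hashlib.sha1(content).hexdigest()
    # Assumed layout: <root>/<chars 0-1>/<chars 2-3>/<chars 4-5>/<full digest>
    return datastore_root / digest[0:2] / digest[2:4] / digest[4:6] / digest


def missing_records(datastore_root: Path, contents: list) -> list:
    """List the expected record paths that are not present on disk."""
    expected = [record_path(datastore_root, c) for c in contents]
    return [p for p in expected if not p.is_file()]


# Example: where the record for the bytes b"abc" would be expected.
print(record_path(Path("datastore"), b"abc"))
```

Feeding missing_records() the bytes of a few files known to have been uploaded
would show whether the records were lost in the copy or are present but not
being resolved by the repository.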

Robin

> cheers
> stefan
> 
>> 
>> However, now that I have managed to get our existing repository running under
>> 2.1.0 I have a new problem: all the nt:file nodes whose content is stored in
>> the datastore (FileDataStore) are missing. The small nt:file nodes that are
>> stored in the database are visible, just not those in the FileDataStore.
>> 
>> When starting up our newly migrated repository for the first time I get a
>> few "Record not found" datastore exceptions and some associated Tika
>> exceptions for those missing datastore records. Would those errors prevent
>> the entire datastore from being used? The number of errors is far lower than
>> the 3000 or so items in the datastore, which suggests that either most of
>> the datastore contents are being ignored or, at start-up at least, they are
>> recognised as valid.
>> 
>> As before, once our repository has started I am able to add new nodes to the
>> datastore, and these behave as expected.
>> 
>> Any help gratefully received - I'm really keen to get our repos onto 2.1.0
>> as some of its new query functionality is much needed!
>> 
>> Robin
>> 
>> 
>> 
>> 
>> On 27 Aug 2010, at 16:03, Robin Wyles wrote:
>> 
>>> Hi Stefan
>>> 
>>> On 27 Aug 2010, at 13:11, Stefan Guggisberg wrote:
>>> 
>>>> On Fri, Aug 27, 2010 at 2:02 PM, Stefan Guggisberg
>>>> <[email protected]> wrote:
>>>>> On Fri, Aug 27, 2010 at 1:18 PM, Robin Wyles <[email protected]> 
>>>>> wrote:
>>>>>> Hi Stefan,
>>>>>> 
>>>>>> Thanks for your quick reply...
>>>>>> 
>>>>>> On 27 Aug 2010, at 11:36, Stefan Guggisberg wrote:
>>>>>> 
>>>>>>> hi robin,
>>>>>>> 
>>>>>>> On Fri, Aug 27, 2010 at 11:25 AM, Robin Wyles <[email protected]> 
>>>>>>> wrote:
>>>>>>>> Hi,
>>>>>>>> 
>>>>>>>> I'm having problems migrating an existing repository from Jackrabbit 
>>>>>>>> 1.6.0 to 2.1.0.
>>>>>>>> 
>>>>>>>> Here are the steps I followed to test the migration:
>>>>>>>> 
>>>>>>>> 1. Update app to use Jackrabbit 2.1.0, run unit tests etc. Manually 
>>>>>>>> test against empty 2.1.0 repository. All works fine here. Our 
>>>>>>>> repository configuration has not changed at all between versions.
>>>>>>>> 
>>>>>>>> 2. Used mysqldump to export production repository.
>>>>>>>> 
>>>>>>>> 3. Copy production repository directory (workspace folder, datastore, 
>>>>>>>> index folders etc.) to test machine.
>>>>>>>> 
>>>>>>>> 4. Import SQL file from 2 above to new DB on test machine.
>>>>>>>> 
>>>>>>>> 5. Start application on test machine.
>>>>>>>> 
>>>>>>>> The result of the above is that the application starts up without 
>>>>>>>> error but that the repository appears empty. I am able to add new 
>>>>>>>> nodes to the repository, which behave correctly within the application 
>>>>>>>> yet none of the existing nodes are visible. I've tried xpath queries 
>>>>>>>> against known paths, e.g. "//library/*" and these return 0 nodes.
>>>>>>>> 
>>>>>>>> A few things I've tried/noticed:
>>>>>>>> 
>>>>>>>> 1. Repeating steps 3 and 4 above, then removing the old index 
>>>>>>>> directories before starting the application. Jackrabbit creates new 
>>>>>>>> lucene indexes, but they are very small, just like they would be when 
>>>>>>>> initialising an empty repository. Also, the index files are called 
>>>>>>>> indexes_2 rather than indexes as they were under 1.6.0.
>>>>>>>> 
>>>>>>>> 2. When starting the app after the migration I notice that 4 extra
>>>>>>>> records have been added to the BUNDLE table, 3 extra records to the
>>>>>>>> VERSION_BUNDLE table and 2 extra records to the VERSION_NAMES table.
>>>>>>>> Again, this seems to be consistent with what is added automatically
>>>>>>>> to the database when a new repository is initialised.
>>>>>>>> 
>>>>>>>> So, basically it appears that Jackrabbit is completely ignoring the 
>>>>>>>> existing repository data, and instead initialising a new repos using 
>>>>>>>> the existing database…
>>>>>>>> 
>>>>>>>> If anyone has any ideas as to how I can get 2.1.0 to recognise our 
>>>>>>>> existing repository they'd be gratefully received - I feel there must 
>>>>>>>> be something simple I've overlooked!
>>>>>>> 
>>>>>>> hmm, seems like the key values (i.e. the id format) have changed.
>>>>>>> however, i am not aware of such a change.
>>>>>>> maybe someone else knows more?
>>>>>> 
>>>>>> The release notes for Jackrabbit 2.0.0 claim that it is backward 
>>>>>> compatible with 1.x repositories. I've seen a couple of messages on the 
>>>>>> users list relating to migration issues but these seem to involve custom 
>>>>>> nodetypes, whereas our repository has no custom nodetypes.
>>>>>> 
>>>>>> How can I see what key values/ID format is used by the different
>>>>>> versions? This sounds like quite a major change to me, and I'm sure
>>>>>> it's something that would've been documented!
>>>>> 
>>>>> absolutely. however, if you're saying that 4 extra records have been
>>>>> inserted into the BUNDLE table
>>>>> and the BUNDLE table already had n>=4 records, i can only explain it
>>>>> with a changed binary representation
>>>>> of the record id's.
>>>>> 
>>>>> the 4 BUNDLE records are:
>>>>> 
>>>>> / (root node)
>>>>> /jcr:system
>>>>> /jcr:system/jcr:nodeTypes
>>>>> /jcr:system/jcr:versionStore
>>>>> 
>>>>> the ids of those nodes are hard-coded in jackrabbit.
>>>>> on startup, those nodes will be created if they don't exist.
>>>>> 
>>>>> i am not a mysql expert. have you compared the configurations
>>>>> of both mysql instances? maybe it's some strange charset/encoding
>>>>> issue...
>>> 
>>> Both mysql instances use the same charset/encoding, and all tables on both 
>>> instances are set to utf-8 for encoding and collation.
>>> 
>>> The only difference between the two mysql instances is their version -
>>> slightly older on our production machine.
>>> 
>>> However, what you say makes sense - it really does look like Jackrabbit 
>>> can't find those nodes on start up which implies there's a charset/encoding 
>>> issue.
>>> 
>>> I'm going to see if I can duplicate the database on our production mysql 
>>> instance and test against that...
>>> 
>>>> 
>>>> or maybe it's a problem with the mysql indexes on those tables...
>>>> 
>>> 
>>> I tried deleting the mysql indexes and recreating them, it didn't seem to 
>>> make any difference.
>>> 
>>> Thanks,
>>> 
>>> Robin
>>> 
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 

