Hi,
As I'm playing around with some production databases (copies thereof), I
noticed that we have a couple of operations which are inherently
unscalable, and therefore not really usable.
The most obvious one is EntityStore.visitEntityStates. It iterates
through the entire database and returns an EntityStoreUnitOfWork in which all
entities have been registered. This simply doesn't work on large databases.
The reason we return an EntityStoreUnitOfWork is to be able to commit any
changes made as part of migration. I would suggest that we skip this,
and that the EntityStore itself instead internally writes back any
entity that has been lazily migrated. Basically (MapEntityStore example):
    public void visitEntity( Reader entityState )
        throws ThrowableType
    {
        final EntityState entity = readEntityState( uow, entityState );
        if( entity.status() == EntityStatus.UPDATED )
        {
            // The entity was lazily migrated while being read, so write the
            // migrated state back to the store right away
            try
            {
                mapEntityStore.applyChanges( new MapEntityStore.MapChanges()
                {
                    public void visitMap( MapEntityStore.MapChanger changer )
                        throws IOException
                    {
                        DefaultEntityState state = (DefaultEntityState) entity;
                        Writer writer = changer.updateEntity( state.identity(),
                                                              state.entityDescriptor().entityType() );
                        writeEntityState( state, writer, state.version(),
                                          state.lastModified() );
                        writer.close();
                    }
                } );
            }
            catch( IOException e )
            {
                logger.warn( "Could not store migrated entity: " + entity.identity(), e );
            }
        }
        visitor.visitEntityState( entity );
        // SKIP THIS! uow.registerEntityState( entity );
    }
---
This would be scalable, since at no point would the entire database be
held in memory, as it is now. I would strongly suggest that this change be made.
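For a caller the main difference would be that visitEntityStates no longer
hands back a unit of work to apply afterwards; you just pass a visitor and
handle each state as it arrives. Something like this is what I have in mind
(the visitor signature here is only for illustration, not necessarily the
exact SPI):

    entityStore.visitEntityStates( new EntityStateVisitor()
    {
        public void visitEntityState( EntityState state )
        {
            // Each state is handled and then dropped, so memory use stays
            // constant regardless of store size. Any lazy migration has
            // already been written back by the store itself.
            System.out.println( state.identity() );
        }
    }, module );
    // No EntityStoreUnitOfWork to apply afterwards.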
I have a feeling that some of the APIs/SPIs for indexing have similar
problems, as I get OOM exceptions when doing reindexing. I'll take a
look at that as well.
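If the indexing side gets the same visitor treatment, the reindexer could
push states into the index in small batches instead of collecting everything
first. Rough sketch only; indexExporter.index(...) and the batch size are
made-up placeholders, not the actual indexing SPI:

    final List<EntityState> batch = new ArrayList<EntityState>();
    entityStore.visitEntityStates( new EntityStateVisitor()
    {
        public void visitEntityState( EntityState state )
        {
            batch.add( state );
            if( batch.size() >= 1000 ) // arbitrary batch size, just for illustration
            {
                indexExporter.index( batch ); // hypothetical per-batch index call
                batch.clear();
            }
        }
    }, module );
    if( !batch.isEmpty() )
    {
        indexExporter.index( batch ); // flush the remainder
    }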
/Rickard