Hey Alejandro,

That was an awesome post. This is very valuable information and a 
great way to handle the issue. Thank you for your contribution to the 
community here!

Best wishes,

Nick


On Friday, September 25, 2015 at 4:13:41 AM UTC-4, Alejandro Gonzalez wrote:
>
> Hello Nick, thanks for your response.
>
> The problem with the second approach is:
>
> - The majority of my entities have Long IDs generated by the Datastore. The 
> auto-generated IDs are scattered IDs: 
> https://cloud.google.com/appengine/docs/java/datastore/entities#Java_Assigning_identifiers
>
> - When copying an entity from the frozen namespace to the new namespace *I 
> need to allocate its ID* in the new namespace; otherwise the Datastore 
> may later generate an ID that is already in use by that entity in the new 
> namespace. From the docs: 
>
>> System-allocated ID values are guaranteed unique to the entity group. If 
>> you copy an entity from one entity group or namespace to another and wish 
>> to preserve the ID part of the key, be sure to allocate the ID first to 
>> prevent Datastore from selecting that ID for a future assignment.
>
>
> - When trying to allocate an ID (a scattered ID auto-generated by the 
> Datastore) in the new namespace, I get this error: 
> java.lang.IllegalArgumentException: Exceeded maximum allocated IDs. 
> After some searching, the error seems to be related to this post on 
> Stack Overflow: 
> http://stackoverflow.com/questions/32652316/exceeded-maximum-allocated-ids-exception-when-allocating-keyrange-appengine-o
> (which describes my situation fairly well, except that I'm using the 
> low-level Datastore API for the allocation), and it is related to this 
> issue in the tracker: 
> https://code.google.com/p/googleappengine/issues/detail?id=11541
>
> So the problem is in the DatastoreService.allocateIdRange() function, when 
> trying to allocate scattered IDs. 
>
>
> Finally I found a workaround that works for me and avoids the bug in 
> DatastoreService.allocateIdRange(). What I'm doing now is basically 
> shortening every ID I find (in entity keys and in properties that 
> reference a Key) when copying the entity to the new namespace:
>
>
> private void allocateIdForKey(Key entityK){
>     if( entityK.getId() > 0 ){
>         //the entity has an id and it must be allocated
>         //to avoid collisions in ids
>         KeyRange range = new KeyRange(
>                 entityK.getParent(), 
>                 entityK.getKind(),
>                 entityK.getId(), 
>                 entityK.getId());
>
>         //throws an exception if the Long ID in the Key is too big
>         ds.allocateIdRange(range);
>     }
> }
>
>
> //avoid https://code.google.com/p/googleappengine/issues/detail?id=11541
> //by cutting down the long ID if it is too big
> private Long getNewId( Long id ){
>     //id / 2 is already integer division on longs, so no rounding is needed
>     return id.toString().length() > 6 ?
>         id / 2 :
>         id;
> }
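
As a quick check of the arithmetic in getNewId above, here is a minimal standalone sketch. The names `IdShortenerSketch` and `shorten` are made up for illustration and nothing here touches the App Engine SDK. It assumes the same rule as the post: IDs with more than 6 decimal digits are halved. Since dividing a `Long` by 2 is integer division, the value is already truncated before any rounding could happen, and two neighbouring IDs can shorten to the same value, which may be worth checking for in the data being copied.

```java
// Standalone sketch; IdShortenerSketch and shorten() are hypothetical names,
// not part of the App Engine SDK or of the original post.
final class IdShortenerSketch {

    // Same rule as getNewId: IDs longer than 6 decimal digits are halved.
    // id / 2 is integer division on long, so the result is already truncated.
    static long shorten(long id) {
        return Long.toString(id).length() > 6 ? id / 2 : id;
    }

    public static void main(String[] args) {
        System.out.println(shorten(999999L));   // 6 digits, unchanged: 999999
        System.out.println(shorten(1234566L));  // 7 digits, halved: 617283
        System.out.println(shorten(1234567L));  // odd neighbour, also 617283
    }
}
```

If such a collision matters for a given kind, one option is to detect it during the copy rather than rely on the halving being collision-free.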
>
> private Key getNewKey(Key key){
>     if( key == null ){ //root keys have no parent; stop the recursion here
>         return null;
>     }
>     boolean keyHasStringId = !StringUtil.isNullOrEmpty(key.getName());
>     if( !keyHasStringId ){ //LONG IDs
>         //shorten the ID if needed!! if we don't, we'll get an
>         //"Exceeded maximum allocated IDs" exception
>         Long newId = getNewId(key.getId());
>         return KeyFactory.createKey(getNewKey(key.getParent()), key.getKind(), 
>                 newId);
>     } else { //STRING IDs
>         return KeyFactory.createKey(getNewKey(key.getParent()), key.getKind(), 
>                 key.getName());
>     }
> }
>
> @Override
> public void map(Entity entity) {
>     //change to the destination namespace, and create the new key for the entity
>     NamespaceManager.set(toNamespace);
>     Key destinationKey = getNewKey(entity.getKey());
>     Entity destinationEntity = new Entity(destinationKey);
>     destinationEntity.setPropertiesFrom(entity);
>
>     //check entity properties for keys to update them
>     final Map<String, Object> properties = entity.getProperties();
>     Set<String> propKeys = properties.keySet();
>     for (String propKey : propKeys) {
>         Object property = entity.getProperty(propKey);
>         if( (property instanceof Key) ){
>             destinationKey = getNewKey((Key) property);
>             destinationEntity.setProperty(propKey, destinationKey);
>         }
>     }
>
>     allocateIdForKey(destinationEntity.getKey());
>     batcher.put(destinationEntity);
> }
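
The reason key-valued properties go through the same getNewKey as the entity's own key is that both have to land on the same shortened ID, otherwise a reference would no longer match the entity it points at. A minimal standalone sketch of that idea follows. `SketchKey`, `KeyRemapSketch` and `remap` are hypothetical stand-ins, not SDK classes, and the explicit null check for root keys is an assumption about how the recursion over parents should end.

```java
import java.util.Objects;

// Standalone sketch: SketchKey is a hypothetical stand-in for the SDK's Key,
// reduced to parent, kind, numeric id and string name.
final class SketchKey {
    final SketchKey parent; // null for a root key
    final String kind;
    final long id;          // 0 when the key uses a string name
    final String name;      // null when the key uses a numeric id

    SketchKey(SketchKey parent, String kind, long id, String name) {
        this.parent = parent;
        this.kind = kind;
        this.id = id;
        this.name = name;
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof SketchKey)) return false;
        SketchKey k = (SketchKey) o;
        return id == k.id && kind.equals(k.kind)
                && Objects.equals(name, k.name)
                && Objects.equals(parent, k.parent);
    }

    @Override public int hashCode() {
        return Objects.hash(parent, kind, id, name);
    }
}

final class KeyRemapSketch {
    // Same shortening rule as getNewId in the post (assumed: halve long IDs).
    static long shorten(long id) {
        return Long.toString(id).length() > 6 ? id / 2 : id;
    }

    // Rebuilds the key from the root down; a null parent ends the recursion.
    static SketchKey remap(SketchKey key) {
        if (key == null) {
            return null;
        }
        SketchKey newParent = remap(key.parent);
        boolean hasName = key.name != null && !key.name.isEmpty();
        return hasName
                ? new SketchKey(newParent, key.kind, 0L, key.name)
                : new SketchKey(newParent, key.kind, shorten(key.id), null);
    }

    public static void main(String[] args) {
        SketchKey parent = new SketchKey(null, "Account", 20000000L, null);
        SketchKey child = new SketchKey(parent, "Post", 0L, "welcome");
        SketchKey reference = new SketchKey(null, "Account", 20000000L, null);

        // The entity's ancestor and a property pointing at the same account
        // are remapped to equal keys, so the reference still resolves.
        System.out.println(remap(child).parent.equals(remap(reference))); // prints "true"
    }
}
```

Because the mapping is a pure function of the old key, each mapper shard can apply it independently without sharing a lookup table.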
>
>
>
> Thanks again for your time and effort to help the community!
>
>
> On Wednesday, September 23, 2015 at 21:00:11 (UTC+2), Nick (Cloud 
> Platform Support) wrote:
>>
>> For the reason you mentioned, simulating each user action sequentially, 
>> incrementing global/sharded counters, triggering the creation of 
>> notification entities, etc. will not be the fastest way. A bulk-creation of 
>> all required data at a frozen state is definitely preferable, and a 
>> distributed task to read the data from a permanent, read-only namespace 
>> into a namespace created for the demo sounds like a great approach.
>>
>> At this point, however, we encounter an issue running MapReduce which I'm 
>> not sure I understand after reading your second paragraph. Are you storing 
>> serialized keys on any of your entities? I don't think there should be a 
>> problem saving entities with the same ID so long as they're in a different 
>> namespace. Maybe share the code for your MapReduce tasks.
>>
>> On Wednesday, September 23, 2015 at 6:51:14 AM UTC-4, Alejandro Gonzalez 
>> wrote:
>>>
>>> Hello,
>>>
>>> I've been struggling with the proper way to create a bunch of demo 
>>> contents "on the fly"...
>>>
>>> *This is the scenario*:
>>>
>>>    1. I have an application in which users can create an account before 
>>>    purchasing it.
>>>    2. When a user creates an account, they may choose to generate demo 
>>>    content (to see the application in action and filled with data).
>>>    3. When a new demo account is created, a lot of new entities need to 
>>>    be created in the newly created namespace for that user.
>>>
>>> *The first approach* was to use the application logic to generate all 
>>> the demo content in the new namespace, in a task queue, when the user 
>>> requests the demo. Shortly after implementing this approach I realized 
>>> that it was very time- and resource-consuming. If I use the application 
>>> logic, a single action may produce more tasks and datastore updates (one 
>>> user makes a +1 on a publication, and that +1 generates notifications, 
>>> user activity, statistics, global counters, etc.).
>>>
>>>
>>> *The second approach* was to use mapreduce/pipelines to, basically, 
>>> copy a pre-generated demo namespace into the new one. This approach 
>>> sounds much better, and takes significantly less time and resources to 
>>> accomplish the task. The problem with this approach is that I started 
>>> seeing this error: "java.lang.IllegalArgumentException: the id allocated 
>>> for a new entity was already in use, please try again". I just need to 
>>> re-allocate the IDs of all the entities I'm copying, but most of them 
>>> rely on automatic ID generation (scattered IDs). As per this issue, I 
>>> can't reallocate scattered IDs:
>>> https://code.google.com/p/googleappengine/issues/detail?id=11541
>>> http://stackoverflow.com/questions/32652316/exceeded-maximum-allocated-ids-exception-when-allocating-keyrange-appengine-o
>>>
>>> For some entities it may make sense to generate the IDs manually 
>>> (legacy IDs), but for other entities it makes no sense at all.
>>>
>>>
>>> My questions are:
>>> - Am I doing something wrong with the second approach? Should I 
>>> reconsider all my application's ID generation just to be able to copy 
>>> entities between namespaces and allocate their scattered IDs?
>>> - Should I focus on generating a demo with legacy IDs to use as a base 
>>> demo to copy to another namespace?
>>> - Am I missing a third approach? 
>>>
>>>
>>> Thanks in advance
>>>
