Re: [orientdb] Re: Replication doesn't work even for demo db - GratefulDeadConcerts - version 1.7.4

Luca Garulli Fri, 11 Jul 2014 03:50:31 -0700

Thank you all for helping me fix replication on Windows.

Lvc@




On 11 July 2014 00:26, galina manashirova <[email protected]>
wrote:

> Yes.
> Thanks to Luca, this issue has been fixed with the latest 1.7.5 hotfix.
> I was testing it all day today: several nodes on the same machine,
> several nodes on different machines. It works great, as expected.
> Everything gets replicated!
> Many thanks to Luca for fixing it in such a short time!
> Chris, thank you for taking the time to test it on Windows. And your
> screencasts are great!
> So, if anyone has any issues with replication, first try the
> latest 1.7.5 hotfix.
>
> -Galina
>
>
>
>
>
>
> On Thursday, July 10, 2014 9:50:05 AM UTC-7, Lvc@ wrote:
>
>> Hi,
>> I've just closed this issue, the last one for 1.7.5 before the release.
>>
>> Lvc@
>>
>>
>>
>> On 10 July 2014 17:54, Chris Wilper <[email protected]> wrote:
>>
>>> Hi Galina,
>>>
>>> I finally got back to trying this in Windows and saw the exact same
>>> error (the stack trace followed by "error on reading distributed request:
>>> deploy_db"). Then on a whim I searched the issues for Windows and came up
>>> with this:
>>>
>>> https://github.com/orientechnologies/orientdb/issues/2347
>>>
>>> So I tried setting orientdb_home as suggested (using forward slashes),
>>> and the final message "error on reading distributed request" no longer
>>> occurs, and things continue as expected after that. I also noticed that
>>> #2347 was just closed today, so it looks like the orientdb_home workaround
>>> will no longer be necessary with 1.7.5.
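
For anyone still on 1.7.4, the forward-slash workaround described above can be sketched in Python. This is a minimal illustration, not part of OrientDB: the helper name `to_forward_slashes` is hypothetical, the install path is the one from the error message quoted later in this thread, and it assumes the server picks up `ORIENTDB_HOME` from the environment of the process that launches it.

```python
import os
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path: str) -> str:
    """Return a Windows path rewritten with forward slashes only."""
    return PureWindowsPath(windows_path).as_posix()


# Example install directory, taken from the error message quoted in this thread.
home = to_forward_slashes(r"C:\Users\user\Downloads\cluster\node1")
print(home)

# Assumption: the server started from this environment reads ORIENTDB_HOME;
# adjust to however the variable is actually set on your host.
env = dict(os.environ, ORIENTDB_HOME=home)
```

The `env` mapping could then be passed to whatever launches the dserver script, so the home directory no longer mixes backslashes and forward slashes.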
>>>
>>> Note however that I still saw the stack trace. In fact, the same stack
>>> trace occurs when running in Windows, Mac, and Linux, and creating a class
>>> in a distributed configuration. On the surface, it doesn't appear to have a
>>> negative consequence. I've reported it as a separate issue with a
>>> screencast demo here:
>>>
>>> https://github.com/orientechnologies/orientdb/issues/2560
>>>
>>> - Chris
>>>
>>>
>>>
>>> On Mon, Jul 7, 2014 at 7:04 PM, galina manashirova <
>>> [email protected]> wrote:
>>>
>>>> Chris,
>>>> Thank you so much for the screencast. Great stuff, it helped a lot!
>>>> I followed the same steps, but on a Windows machine.
>>>> At the point when I created the database People, node1 threw an
>>>> exception about node2 (see below).
>>>> The database was created only on node1; node2 has only one JSON file.
>>>> Is that the same issue you were able to fix by shutting down VMWare?
>>>> I don't think I have VMWare running anywhere.
>>>> Does anyone know if there is another workaround for this problem?
>>>> I am using version 1.7.4.
>>>>
>>>> 2014-07-07 15:54:01:645 INFO Sent updated cluster configuration to the remote client 127.0.0.1:50895 [OClientConnectionManager]
>>>> Exception in thread "hz._hzInstance_1_orientdb.cached.thread-1" java.lang.NullPointerException
>>>>         at com.orientechnologies.orient.server.OClientConnection.getRemoteAddress(OClientConnection.java:68)
>>>>         at com.orientechnologies.orient.server.OClientConnectionManager.pushDistribCfg2Clients(OClientConnectionManager.java:257)
>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.entryUpdated(OHazelcastPlugin.java:575)
>>>>         at com.hazelcast.map.MapService.dispatchEvent(MapService.java:906)
>>>>         at com.hazelcast.map.MapService.dispatchEvent(MapService.java:70)
>>>>         at com.hazelcast.spi.impl.EventServiceImpl$EventPacketProcessor.process(EventServiceImpl.java:509)
>>>>         at com.hazelcast.spi.impl.EventServiceImpl$RemoteEventPacketProcessor.run(EventServiceImpl.java:535)
>>>>         at com.hazelcast.util.executor.StripedExecutor$Worker.run(StripedExecutor.java:142)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>>>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>>>>         at java.lang.Thread.run(Thread.java:662)
>>>>         at com.hazelcast.util.executor.PoolExecutorThreadFactory$ManagedThread.run(PoolExecutorThreadFactory.java:59)
>>>> [node1]<-[node2] error on reading distributed request: deploy_db
>>>>
>>>> Thanks.
>>>> -galina
>>>>
>>>>
>>>>
>>>> On Friday, July 4, 2014 12:28:40 AM UTC-7, Chris Wilper wrote:
>>>>
>>>>> Update:
>>>>>
>>>>> Ok, I haven't determined why I saw the odd behavior in Windows, but I
>>>>> *have* been able to successfully set up multiple nodes w/replication on
>>>>> OSX. After looking more carefully at the console output, I noticed on the
>>>>> Mac that orient was binding to an unfamiliar IP address. It turns out it
>>>>> was trying to connect via a virtual software network device (VMWare), and
>>>>> I believe this explains why I saw the odd behavior; after I shut down
>>>>> vmware, I was successful.
>>>>>
>>>>> Here is a screencast showing how I got it working with two nodes:
>>>>> http://screencast.com/t/IiC5SIlUAk
>>>>>
>>>>> I basically created two empty nodes, then connected and created a
>>>>> database and class, and added a record. It shows that the database was
>>>>> definitely created on both nodes (the database directory), and that if one
>>>>> node goes down, the other still provides access to the replicated record.
>>>>>
>>>>> One thing I realized in this process is that the first node you start
>>>>> on a given network device seems to have special status. I guess
>>>>> it is the one responsible for communicating which nodes it knows are
>>>>> available (including itself). So if you start node1, node2, and node3 all
>>>>> on the same host in that order, you can shut down nodes 2 and 3 just fine,
>>>>> but if you instead keep those running and try to shut down node1, you
>>>>> can't subsequently connect. However, if you restart any node, it will
>>>>> take over the role that node1 had and you can then connect to the cluster
>>>>> again. At least that's the behavior I think I'm observing. Does that sound
>>>>> right to anybody familiar with this? Any way to get around it?
>>>>>
>>>>> Thanks,
>>>>> Chris
>>>>>
>>>>>
>>>>> On Thu, Jul 3, 2014 at 7:59 PM, galina manashirova <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Another test of replication:
>>>>>>
>>>>>> 1. Started node1
>>>>>> 2. Started node2
>>>>>> The log file tells me that they are talking to each other.
>>>>>> I logged in to the database (from the console) on node1 and created a
>>>>>> new class:
>>>>>>
>>>>>> CREATE CLASS CUSTOMER EXTENDS V
>>>>>>
>>>>>> Nothing happened on node2.
>>>>>> Since it is master-to-master replication, shouldn't it replicate right
>>>>>> away?
>>>>>> I killed node1, then restarted it, and only after that could I see
>>>>>> my new CUSTOMER class on the console of node2.
>>>>>> So, does replication happen only when one of the nodes goes down?
>>>>>>
>>>>>> Is this expected behavior?
>>>>>>
>>>>>> -Galina
>>>>>>
>>>>>>
>>>>>> On Thursday, July 3, 2014 2:30:51 PM UTC-7, Chris Wilper wrote:
>>>>>>
>>>>>>> Another data point:
>>>>>>>
>>>>>>> I just tried configuring replication with two nodes on the same host
>>>>>>> with a fresh install of 1.7.4 on Windows and OSX, and I was also not
>>>>>>> successful. But I saw different problems than you did.
>>>>>>>
>>>>>>> Steps I followed:
>>>>>>>   1) Unpack the official distribution in two separate directories on
>>>>>>> the same host, one for node1 and one for node2
>>>>>>>   2) Start node1 immediately by going into bin and running the
>>>>>>> dserver script
>>>>>>>   3) Modify node2's config/hazelcast.xml file, changing the port
>>>>>>> element's value from 2434 to 2435
>>>>>>>   4) Start node2
>>>>>>>
>>>>>>> After this, from the console output I could see that both nodes
>>>>>>> recognized that they were part of the cluster and could see the other
>>>>>>> one.
>>>>>>>
>>>>>>> But then I ran console.sh:
>>>>>>>
>>>>>>> orientdb> connect remote:localhost/GratefulDeadConcerts admin admin
>>>>>>>
>>>>>>> On Windows:
>>>>>>> -------------------
>>>>>>>
>>>>>>> It successfully connected, then showed me the DISTRIBUTED
>>>>>>> CONFIGURATION, which looked correct. Then I ran a simple query (SELECT
>>>>>>> COUNT(*) FROM V) successfully. Next, I tried stopping node2 to simulate
>>>>>>> node failure. Queries still worked fine. Then I restarted node2, and
>>>>>>> queries still worked as expected. Next, I tried stopping node1 and
>>>>>>> suddenly queries from the console failed with messages about not being
>>>>>>> able to connect. Then I exited and restarted the console. Same problem.
>>>>>>> Finally, I decided to stop the other node, restart both nodes, and
>>>>>>> restart the console. Immediately upon attempting to connect, I got the
>>>>>>> following:
>>>>>>>
>>>>>>> Connecting to database [remote:localhost/GratefulDeadConcerts] with user 'admin'...
>>>>>>> Error: com.orientechnologies.orient.core.exception.OConfigurationException: Database 'GratefulDeadConcerts' is not configured on server (home=C:\Users\user\Downloads\cluster\node1/databases/)
>>>>>>>
>>>>>>> Next I looked in the databases\GratefulDeadConcerts\ directory and
>>>>>>> saw there was a single file in there, distributed-config.json, but no
>>>>>>> data files. For either node. Uh oh...
>>>>>>>
>>>>>>> On OS X:
>>>>>>> --------------
>>>>>>>
>>>>>>> It successfully connected, then said:
>>>>>>> DISTRIBUTED CONFIGURATION: none (OrientDB is running in standalone
>>>>>>> mode)
>>>>>>>
>>>>>>> ...even though the nodes seem to think they're running in
>>>>>>> distributed mode.
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Can anyone else reproduce these behaviors with a fresh 1.7.4 install?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Chris
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jul 3, 2014 at 2:05 PM, galina manashirova <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Can anybody please help me with this, or at least come up with a
>>>>>>>> better tutorial on replication?
>>>>>>>>
>>>>>>>> -Galina
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wednesday, July 2, 2014 12:44:22 PM UTC-7, galina manashirova
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Started from scratch:
>>>>>>>>> 1. Downloaded version 1.7.4
>>>>>>>>> 2. Started server node1 in distributed mode (dserver)
>>>>>>>>> 3. Copied node1 directory as node2
>>>>>>>>> 4. Changed nodeName in orientdb-dserver-config.xml on both nodes,
>>>>>>>>> giving them different names.
>>>>>>>>> 5. Started node2
>>>>>>>>>     Both nodes see each other. I see in the console for one node:
>>>>>>>>>
>>>>>>>>> Members [2] {
>>>>>>>>>         Member [10.32.10.72]:2434 this
>>>>>>>>>         Member [10.32.10.72]:2435
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>>     And on the console of another node:
>>>>>>>>>
>>>>>>>>> Members [2] {
>>>>>>>>>         Member [10.32.10.72]:2434
>>>>>>>>>         Member [10.32.10.72]:2435 this
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> They are definitely talking to each other, except that one of the
>>>>>>>>> nodes gave me an error:
>>>>>>>>>
>>>>>>>>> 2014-07-02 12:12:56:234 WARN [node2]->[[node1]] requesting deploy of database 'GratefulDeadConcerts' on local server... [OHazelcastPlugin]
>>>>>>>>> 2014-07-02 12:32:56:266 WARN [node2] timeout (1200001ms) on waiting for synchronous responses from nodes=[node1] responsesSoFar=[] request=id=0 from=node2 task=deploy_db [OHazelcastDistributedDatabase]
>>>>>>>>> Exception in thread "main" com.orientechnologies.orient.server.distributed.ODistributedException: Error on sending distributed request against database 'GratefulDeadConcerts' to nodes [node1]
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:194)
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:364)
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabase(OHazelcastPlugin.java:813)
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabases(OHazelcastPlugin.java:767)
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:191)
>>>>>>>>>         at com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:720)
>>>>>>>>>         at com.orientechnologies.orient.server.OServer.activate(OServer.java:241)
>>>>>>>>>         at com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:32)
>>>>>>>>> Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: No response received from any of nodes [node1] for request id=0 from=node2 task=deploy_db
>>>>>>>>>         at com.orientechnologies.orient.server.distributed.ODistributedResponseManager.getFinalResponse(ODistributedResponseManager.java:395)
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.waitForResponse(OHazelcastDistributedDatabase.java:422)
>>>>>>>>>         at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:191)
>>>>>>>>>         ... 7 more
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Even though right above that I see a log message saying that the
>>>>>>>>> GratefulDeadConcerts distributed configuration sees 2 nodes:
>>>>>>>>>
>>>>>>>>> 2014-07-02 12:12:56:216 INFO updated distributed configuration for
>>>>>>>>> database: GratefulDeadConcerts:
>>>>>>>>> ----------
>>>>>>>>> {
>>>>>>>>>   "version":2,
>>>>>>>>>   "autoDeploy":true,
>>>>>>>>>   "hotAlignment":false,
>>>>>>>>>   "readQuorum":1,
>>>>>>>>>   "writeQuorum":2,
>>>>>>>>>   "failureAvailableNodesLessQuorum":false,
>>>>>>>>>   "readYourWrites":true,
>>>>>>>>>   "clusters":{
>>>>>>>>>     "internal":null,
>>>>>>>>>     "index":null,
>>>>>>>>>     "*":{
>>>>>>>>>       "servers":["<NEW_NODE>","node1","node2"]
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>> }
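
As a reading aid for the configuration above, here is a small Python sketch that parses it and checks whether enough of the listed servers are online to satisfy `writeQuorum`. The helper names are hypothetical, and two points are assumptions rather than documented OrientDB behavior: that `"<NEW_NODE>"` is a placeholder and not a real server, and that `writeQuorum` counts the servers that must be reachable for a write.

```python
import json

# Configuration copied from the log message above.
CONFIG = """
{
  "version": 2,
  "autoDeploy": true,
  "hotAlignment": false,
  "readQuorum": 1,
  "writeQuorum": 2,
  "failureAvailableNodesLessQuorum": false,
  "readYourWrites": true,
  "clusters": {
    "internal": null,
    "index": null,
    "*": {"servers": ["<NEW_NODE>", "node1", "node2"]}
  }
}
"""


def real_servers(config, cluster="*"):
    """List the servers configured for a cluster, skipping the "<NEW_NODE>" entry."""
    # Assumption: "<NEW_NODE>" is a placeholder, not an actual node name.
    return [s for s in config["clusters"][cluster]["servers"] if s != "<NEW_NODE>"]


def write_quorum_reachable(config, online):
    """Return True when enough configured servers are online to meet writeQuorum."""
    # Assumption: writeQuorum is the number of servers a write must reach.
    configured = set(real_servers(config))
    return len(configured & set(online)) >= config["writeQuorum"]


config = json.loads(CONFIG)
print(real_servers(config))
print(write_quorum_reachable(config, ["node1", "node2"]))
print(write_quorum_reachable(config, ["node2"]))
```

Under those assumptions, a two-node cluster with `writeQuorum` set to 2 would need both nodes available for a write to meet the quorum.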
>>>>>>>>> When I try to add or remove something on one node in that
>>>>>>>>> database, nothing happens on the other one.
>>>>>>>>> Nothing gets replicated at the database level.
>>>>>>>>> Can someone please tell me what I am doing wrong?
>>>>>>>>> I am not trying anything fancy with replication. This is just a
>>>>>>>>> basic replication task.
>>>>>>>>> I tried replication in some earlier versions (I don't remember now
>>>>>>>>> which ones) and it worked. Now I can't make it work.
>>>>>>>>> We are trying to implement OrientDB for one of our company's
>>>>>>>>> products, and if replication is not going to work we will have to
>>>>>>>>> look for something else.
>>>>>>>>> Please let me know if I am doing something wrong.
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>> -galina
>>>>>>>>>
>>>>>>>>>
>>>>>>>>  --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google
>>>>>>>> Groups "OrientDB" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>>>>> send an email to [email protected].
>>>>>>>>
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
