Thank you all for helping me fix replication on Windows.

Lvc@

On 11 July 2014 00:26, galina manashirova <[email protected]> wrote:

> Yes.
> Thanks to Luca, this issue has been fixed with the latest 1.7.5 hot fix.
> I was testing it all day today: several nodes on the same machine, several nodes on different machines. Works great, as expected. Everything gets replicated!
> Many thanks to Luca for fixing it in such a short time!
> Chris, thank you for taking the time to test it on Windows. And your screencasts are great!
> So, if anyone has any issues with replication, first try the latest 1.7.5 hot fix.
>
> -Galina
>
> On Thursday, July 10, 2014 9:50:05 AM UTC-7, Lvc@ wrote:
>
>> Hi,
>> I've just closed this issue as the last one for 1.7.5 before the release.
>>
>> Lvc@
>>
>> On 10 July 2014 17:54, Chris Wilper <[email protected]> wrote:
>>
>>> Hi Galina,
>>>
>>> I finally got back to trying this in Windows and saw the exact same error (the stack trace followed by "error on reading distributed request: deploy_db"). Then on a whim I searched the issues for windows and came up with this:
>>>
>>> https://github.com/orientechnologies/orientdb/issues/2347
>>>
>>> So I tried setting orientdb_home as suggested (using forward slashes), and the final message "error on reading distributed request" no longer occurs, and things continue as expected after that. I also noticed that #2347 was just closed today, so it looks like the orientdb_home workaround will no longer be necessary with 1.7.5.
>>>
>>> Note however that I still saw the stack trace. In fact, the same stack trace occurs when running in Windows, Mac, and Linux, and creating a class in a distributed configuration. On the surface, it doesn't appear to have a negative consequence.
>>> I've reported it as a separate issue with a screencast demo here:
>>>
>>> https://github.com/orientechnologies/orientdb/issues/2560
>>>
>>> - Chris
>>>
>>> On Mon, Jul 7, 2014 at 7:04 PM, galina manashirova <[email protected]> wrote:
>>>
>>>> Chris,
>>>> Thank you so much for the screencast, great stuff. It helped a lot!
>>>> I followed the same steps, but on a Windows machine.
>>>> At the point when I created the database People, node1 threw an exception about node2 (see below).
>>>> The database was created only on node1; node2 has only one JSON file.
>>>> Is that the same issue you were able to fix by shutting down VMWare?
>>>> I don't think I have VMWare running anywhere.
>>>> Does anyone know if there is another workaround for this problem?
>>>> Using version 1.7.4.
>>>>
>>>> 2014-07-07 15:54:01:645 INFO Sent updated cluster configuration to the remote client 127.0.0.1:50895 [OClientConnectionManager]
>>>> Exception in thread "hz._hzInstance_1_orientdb.cached.thread-1" java.lang.NullPointerException
>>>>     at com.orientechnologies.orient.server.OClientConnection.getRemoteAddress(OClientConnection.java:68)
>>>>     at com.orientechnologies.orient.server.OClientConnectionManager.pushDistribCfg2Clients(OClientConnectionManager.java:257)
>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.entryUpdated(OHazelcastPlugin.java:575)
>>>>     at com.hazelcast.map.MapService.dispatchEvent(MapService.java:906)
>>>>     at com.hazelcast.map.MapService.dispatchEvent(MapService.java:70)
>>>>     at com.hazelcast.spi.impl.EventServiceImpl$EventPacketProcessor.process(EventServiceImpl.java:509)
>>>>     at com.hazelcast.spi.impl.EventServiceImpl$RemoteEventPacketProcessor.run(EventServiceImpl.java:535)
>>>>     at com.hazelcast.util.executor.StripedExecutor$Worker.run(StripedExecutor.java:142)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>>>>     at java.lang.Thread.run(Thread.java:662)
>>>>     at com.hazelcast.util.executor.PoolExecutorThreadFactory$ManagedThread.run(PoolExecutorThreadFactory.java:59)
>>>> [node1]<-[node2] error on reading distributed request: deploy_db
>>>>
>>>> Thanks.
>>>> -galina
>>>>
>>>> On Friday, July 4, 2014 12:28:40 AM UTC-7, Chris Wilper wrote:
>>>>
>>>>> Update:
>>>>>
>>>>> Ok, I haven't determined why I saw the odd behavior in Windows, but I *have* been able to successfully set up multiple nodes w/replication on OSX. After looking more carefully at the console output, I noticed on the Mac that orient was binding to an unfamiliar IP address. It turns out it was trying to connect via a virtual software network device (VMWare), and I believe this explains why I saw the odd behavior; after I shut down vmware, I was successful.
>>>>>
>>>>> Here is a screencast showing how I got it working with two nodes:
>>>>> http://screencast.com/t/IiC5SIlUAk
>>>>>
>>>>> I basically created two empty nodes, then connected and created a database and class, and added a record. It shows that the database was definitely created on both nodes (the database directory), and that if one node goes down, the other still provides access to the replicated record.
>>>>>
>>>>> One thing I realized in this process was that the first node you start on a given network device seems to have special status. I guess it is the one responsible for communicating which nodes it knows are available (including itself).
>>>>> So if you start node1, node2, and node3 all on the same host in that order, you can shut down nodes 2 and 3 just fine, but if you instead keep those running and try to shut down node1, you can't subsequently connect. However, if you restart any node, it will take over the role that node1 had and you can then connect to the cluster again. At least that's the behavior I think I'm observing. Does that sound right to anybody familiar with this? Any way to get around it?
>>>>>
>>>>> Thanks,
>>>>> Chris
>>>>>
>>>>> On Thu, Jul 3, 2014 at 7:59 PM, galina manashirova <[email protected]> wrote:
>>>>>
>>>>>> Another test of replication:
>>>>>>
>>>>>> 1. Started node1
>>>>>> 2. Started node2
>>>>>>
>>>>>> The log file tells me that they are talking to each other.
>>>>>> I logged in to the database (from the console) on node1 and created a new class:
>>>>>>
>>>>>> CREATE CLASS CUSTOMER EXTENDS V
>>>>>>
>>>>>> Nothing happened on node2.
>>>>>> Since it is master-to-master replication, shouldn't it replicate right away?
>>>>>> I killed node1, then restarted node1, and only after that could I see my new CUSTOMER class on the console of node2.
>>>>>> So, replication happens only if one of the nodes is going down?
>>>>>>
>>>>>> Is this expected behavior?
>>>>>>
>>>>>> -Galina
>>>>>>
>>>>>> On Thursday, July 3, 2014 2:30:51 PM UTC-7, Chris Wilper wrote:
>>>>>>
>>>>>>> Another data point:
>>>>>>>
>>>>>>> I just tried configuring replication with two nodes on the same host with a fresh install of 1.7.4 on Windows and OSX, and I was also not successful. But I saw different problems than you did.
>>>>>>>
>>>>>>> Steps I followed:
>>>>>>> 1) Unpack the official distribution in two separate directories on the same host, one for node1 and one for node2
>>>>>>> 2) Start node1 immediately by going into bin and running the dserver script
>>>>>>> 3) Modify node2's config/hazelcast.xml file, changing the port element's value from 2434 to 2435
>>>>>>> 4) Start node2
>>>>>>>
>>>>>>> After this, from the console output I could see that both nodes recognized that they were part of the cluster and could see the other one.
>>>>>>>
>>>>>>> But then I ran console.sh:
>>>>>>>
>>>>>>> orientdb> connect remote:localhost/GratefulDeadConcerts admin admin
>>>>>>>
>>>>>>> On Windows:
>>>>>>> -------------------
>>>>>>>
>>>>>>> It successfully connected, then showed me the DISTRIBUTED CONFIGURATION, which looked correct. Then I ran a simple query (SELECT COUNT(*) FROM V) successfully. Next, I tried stopping node2 to simulate node failure. Queries still worked fine. Then I restarted node2, and queries still worked as expected. Next, I tried stopping node1 and suddenly queries from the console failed with messages about not being able to connect. Then I exited and restarted the console. Same problem. Finally, I decided to stop the other node, restart both nodes, and restart the console. Immediately upon attempting to connect, I got the following:
>>>>>>>
>>>>>>> Connecting to database [remote:localhost/GratefulDeadConcerts] with user 'admin'...
>>>>>>> Error: com.orientechnologies.orient.core.exception.OConfigurationException: Database 'GratefulDeadConcerts' is not configured on server (home=C:\Users\user\Downloads\cluster\node1/databases/)
>>>>>>>
>>>>>>> Next I looked in the databases\GratefulDeadConcerts\ directory and saw there was a single file in there, distributed-config.json, but no data files. For either node. Uh oh...
>>>>>>>
>>>>>>> On OS X:
>>>>>>> --------------
>>>>>>>
>>>>>>> It successfully connected, then said:
>>>>>>> DISTRIBUTED CONFIGURATION: none (OrientDB is running in standalone mode)
>>>>>>>
>>>>>>> ...even though the nodes seem to think they're running in distributed mode.
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Can anyone else reproduce these behaviors with a fresh 1.7.4 install?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Chris
>>>>>>>
>>>>>>> On Thu, Jul 3, 2014 at 2:05 PM, galina manashirova <[email protected]> wrote:
>>>>>>>
>>>>>>>> Can anybody please help me with this, or at least come up with a better tutorial on replication?
>>>>>>>>
>>>>>>>> -Galina
>>>>>>>>
>>>>>>>> On Wednesday, July 2, 2014 12:44:22 PM UTC-7, galina manashirova wrote:
>>>>>>>>>
>>>>>>>>> Started from scratch:
>>>>>>>>> 1. Downloaded version 1.7.4
>>>>>>>>> 2. Started server node1 in distributed mode (dserver)
>>>>>>>>> 3. Copied the node1 directory as node2
>>>>>>>>> 4. Changed nodeName in orientdb-dserver-config.xml on both nodes, giving them different names
>>>>>>>>> 5. Started node2
>>>>>>>>>
>>>>>>>>> Both nodes see each other.
>>>>>>>>> I see in the console for one node:
>>>>>>>>>
>>>>>>>>> Members [2] {
>>>>>>>>>     Member [10.32.10.72]:2434 this
>>>>>>>>>     Member [10.32.10.72]:2435
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> And on the console of the other node:
>>>>>>>>>
>>>>>>>>> Members [2] {
>>>>>>>>>     Member [10.32.10.72]:2434
>>>>>>>>>     Member [10.32.10.72]:2435 this
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> They are definitely talking to each other. Except one of the nodes gave me an error:
>>>>>>>>>
>>>>>>>>> 2014-07-02 12:12:56:234 WARN [node2]->[[node1]] requesting deploy of database 'GratefulDeadConcerts' on local server... [OHazelcastPlugin]
>>>>>>>>> 2014-07-02 12:32:56:266 WARN [node2] timeout (1200001ms) on waiting for synchronous responses from nodes=[node1] responsesSoFar=[] request=id=0 from=node2 task=deploy_db [OHazelcastDistributedDatabase]
>>>>>>>>> Exception in thread "main" com.orientechnologies.orient.server.distributed.ODistributedException: Error on sending distributed request against database 'GratefulDeadConcerts' to nodes [node1]
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:194)
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.sendRequest(OHazelcastPlugin.java:364)
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installDatabase(OHazelcastPlugin.java:813)
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.installNewDatabases(OHazelcastPlugin.java:767)
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastPlugin.startup(OHazelcastPlugin.java:191)
>>>>>>>>>     at com.orientechnologies.orient.server.OServer.registerPlugins(OServer.java:720)
>>>>>>>>>     at com.orientechnologies.orient.server.OServer.activate(OServer.java:241)
>>>>>>>>>     at com.orientechnologies.orient.server.OServerMain.main(OServerMain.java:32)
>>>>>>>>> Caused by: com.orientechnologies.orient.server.distributed.ODistributedException: No response received from any of nodes [node1] for request id=0 from=node2 task=deploy_db
>>>>>>>>>     at com.orientechnologies.orient.server.distributed.ODistributedResponseManager.getFinalResponse(ODistributedResponseManager.java:395)
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.waitForResponse(OHazelcastDistributedDatabase.java:422)
>>>>>>>>>     at com.orientechnologies.orient.server.hazelcast.OHazelcastDistributedDatabase.send2Nodes(OHazelcastDistributedDatabase.java:191)
>>>>>>>>>     ... 7 more
>>>>>>>>>
>>>>>>>>> Even though right above that I see a log message saying that the GratefulDeadConcerts distributed configuration sees 2 nodes:
>>>>>>>>>
>>>>>>>>> 2014-07-02 12:12:56:216 INFO updated distributed configuration for database: GratefulDeadConcerts:
>>>>>>>>> ----------
>>>>>>>>> {
>>>>>>>>>   "version":2,
>>>>>>>>>   "autoDeploy":true,
>>>>>>>>>   "hotAlignment":false,
>>>>>>>>>   "readQuorum":1,
>>>>>>>>>   "writeQuorum":2,
>>>>>>>>>   "failureAvailableNodesLessQuorum":false,
>>>>>>>>>   "readYourWrites":true,
>>>>>>>>>   "clusters":{
>>>>>>>>>     "internal":null,
>>>>>>>>>     "index":null,
>>>>>>>>>     "*":{
>>>>>>>>>       "servers":["<NEW_NODE>","node1","node2"]
>>>>>>>>>     }
>>>>>>>>>   }
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> When I try to add or remove something on that database from one node, nothing happens on the other one.
>>>>>>>>> Nothing gets replicated at the database level.
>>>>>>>>> Can someone please tell me what I am doing wrong?
>>>>>>>>> I am not trying anything fancy with replication.
>>>>>>>>> This is just a basic replication task.
>>>>>>>>> I tried replication in some earlier version (I don't remember now which one) and it worked. Now I can't make it work.
>>>>>>>>> We are trying to adopt OrientDB for one of our company's products, and if replication is not going to work we would have to look for something else.
>>>>>>>>> Please let me know if I am doing something wrong.
>>>>>>>>>
>>>>>>>>> Thank you.
>>>>>>>>> -galina
>>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> ---
>>>>>>>> You received this message because you are subscribed to the Google Groups "OrientDB" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>>>>>>>>
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
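
For anyone still on 1.7.4 who needs the orientdb_home workaround Chris mentions in this thread (a home path written with forward slashes), here is a minimal sketch of the path conversion only. The install location is a hypothetical example modeled on the path in the quoted error message, and the helper name `to_forward_slashes` is made up for illustration. How the value is then passed to the server is described in issue #2347 and is not shown here.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path: str) -> str:
    """Return the same Windows path written with forward slashes."""
    # PureWindowsPath parses Windows-style paths on any operating system,
    # and as_posix() renders the result with "/" as the separator.
    return PureWindowsPath(windows_path).as_posix()


# Hypothetical node1 install directory (an assumption, not taken from
# any OrientDB documentation).
node1_home = to_forward_slashes(r"C:\Users\user\Downloads\cluster\node1")
print(node1_home)
```

According to the thread, this workaround should not be needed on 1.7.5 or later.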
