This is almost certainly a build issue, but exactly "what" is unknown.
How comfortable are you with modifying the source? If you are, I can
guide you through debugging the issue. If not, I can make a patch
that you could build and run.
In short, the error originates here, in fs/ocfs2/cluster/tcp.c:
	if (sc->sc_page_off == sizeof(struct o2net_msg)) {
		hdr = page_address(sc->sc_page);
		if (be16_to_cpu(hdr->data_len) > O2NET_MAX_PAYLOAD_BYTES)
			ret = -EOVERFLOW;
	}
Lukas Posadka wrote:
Hello,
I have two servers, both connected to an external array, each by its
own SAS connection. I need these servers to work simultaneously with
the data on the array, and I think that ocfs2 is suitable for this
purpose. One server is a P4 Xeon (Gentoo Linux, i386, 2.6.22-r2) and
the second is an Opteron (Gentoo Linux, x86_64, 2.6.22-r2). The
servers are connected by ethernet; both adapters are Intel
EtherExpress1000.
First, I compiled the ocfs2 modules provided with the kernel, and
then I downloaded and compiled ocfs2-tools (1.2.6). Following the
manual, I created cluster.conf, loaded the modules, and mounted
/config and /dlm. On both systems the external array is /dev/sdb, so
I made a Linux partition /dev/sdb1 and an ocfs2 filesystem on it.
The cluster is started by the command
# /sbin/o2cb_ctl -H -n clust -t cluster -a online=yes
on both servers.
If I mount the filesystem on only one server, all is OK. I can read
and write files on the filesystem, and the second server can see the
first server's heartbeating.
-------------
serv_x86_64 # mounted.ocfs2 -f
Device      FS      Nodes
/dev/sdb1   ocfs2   serv_i386
-------------
serv_i386 # mounted.ocfs2 -f
Device      FS      Nodes
/dev/sdb1   ocfs2   serv_x86_64
-------------
The problem appears when I try to mount the filesystem on both
servers. In the first case, serv_i386 has the filesystem mounted and
serv_x86_64 attempts to mount it too. After about 14 seconds this
message appears:
---------------
serv_x86_64 # mount -t ocfs2 /dev/sdb1 /ext_arrays/ds3200_1/
mount.ocfs2: Value too large for defined data type while mounting
/dev/sdb1 on /ext_arrays/ds3200_1/. Check 'dmesg' for more information
on this error.
---------------
In serv_x86_64's dmesg are the following lines:
----------------
ocfs2_dlm: Nodes in domain ("892E82953F2147A4BD75E2AAC5750BD3"): 1
o2net: connected to node serv_i386 (num 0) at 19X.XXX.69.194:7777
ocfs2_dlm: Nodes in domain ("892E82953F2147A4BD75E2AAC5750BD3"): 0 1
kjournald starting. Commit interval 5 seconds
(11637,3):ocfs2_broadcast_vote:434 ERROR: status = -75
(11637,3):ocfs2_do_request_vote:504 ERROR: status = -75
(11637,3):ocfs2_mount_volume:1117 ERROR: status = -75
(11637,3):ocfs2_broadcast_vote:434 ERROR: status = -75
(11637,3):ocfs2_do_request_vote:504 ERROR: status = -75
(11637,3):ocfs2_dismount_volume:1179 ERROR: status = -75
ocfs2: Unmounting device (8,17) on (node 1)
o2net: no longer connected to node serv_i386 (num 0) at
19X.XXX.69.194:7777
--------------------
and in serv_i386's these:
--------------------
o2net: accepted connection from node serv_x86_64 (num 1) at
19X.XXX.69.196:7777
ocfs2_dlm: Node 1 joins domain 892E82953F2147A4BD75E2AAC5750BD3
ocfs2_dlm: Nodes in domain ("892E82953F2147A4BD75E2AAC5750BD3"): 0 1
ocfs2_dlm: Node 1 leaves domain 892E82953F2147A4BD75E2AAC5750BD3
ocfs2_dlm: Nodes in domain ("892E82953F2147A4BD75E2AAC5750BD3"): 0
o2net: no longer connected to node serv_x86_64 (num 1) at
19X.XXX.69.196:7777
----------------------
When I try to connect the servers in the opposite order (x86_64
first), the mount stalls and it is impossible to interrupt it or to
unmount the already-mounted filesystem on the other machine. The
firewall is down; listings are at the end of this email.
Can anybody help me with this problem, please?
Thanks,
Lukas Posadka, CZ
serv_x86_64 ---------------------
# lsmod
Module                 Size  Used by
ocfs2_dlmfs           20112  1
ocfs2                358632  0
ocfs2_dlm            187144  2 ocfs2_dlmfs,ocfs2
ocfs2_nodemanager    176072  6 ocfs2_dlmfs,ocfs2,ocfs2_dlm
configfs              25884  2 ocfs2_nodemanager
# dmesg
...
OCFS2 Node Manager 1.3.3
OCFS2 DLM 1.3.3
OCFS2 1.3.3
OCFS2 DLMFS 1.3.3
OCFS2 User DLM kernel interface loaded
...
# mount
...
none on /config type configfs (rw)
none on /dlm type ocfs2_dlmfs (rw)
# ls -l /config/
total 0
drwxr-xr-x 3 root root 0 Aug 23 00:20 cluster
...
----------------------
serv_i386
# lsmod
Module                 Size  Used by
ocfs2_dlmfs           18824  1
ocfs2                378820  0
ocfs2_dlm            186756  2 ocfs2_dlmfs,ocfs2
ocfs2_nodemanager    123972  6 ocfs2_dlmfs,ocfs2,ocfs2_dlm
configfs              21520  2 ocfs2_nodemanager
# dmesg
...
OCFS2 Node Manager 1.3.3
OCFS2 DLM 1.3.3
OCFS2 1.3.3
kjournald starting. Commit interval 5 seconds
EXT3 FS on sda4, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
OCFS2 DLMFS 1.3.3
OCFS2 User DLM kernel interface loaded
...
# mount
...
none on /config type configfs (rw)
none on /dlm type ocfs2_dlmfs (rw)
# ls -l /config/
total 0
drwxr-xr-x 3 root root 0 Aug 23 00:21 cluster
------------------------------
serv_i386 # fsck /dev/sdb1
fsck 1.39 (29-May-2006)
Checking OCFS2 filesystem in /dev/sdb1:
label:              <NONE>
uuid:               89 2e 82 95 3f 21 47 a4 bd 75 e2 aa c5 75 0b d3
number of blocks:   393214944
bytes per block:    4096
number of clusters: 12287967
bytes per cluster:  131072
max slots:          2
/dev/sdb1 is clean. It will be checked after 20 additional mounts.
----------cluster.conf------------
cluster:
	node_count = 2
	name = clust

node:
	ip_port = 7777
	ip_address = 19X.XXX.69.194
	number = 0
	name = serv_i386
	cluster = clust

node:
	ip_port = 7777
	ip_address = 19X.XXX.69.196
	number = 1
	name = serv_x86_64
	cluster = clust
****************************************************
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users